OpenAI Agents SDK

This guide explains how we built simple monitoring for AI agents using the OpenAI Agents SDK. We send every important action to AIceberg for safety checks before continuing.

What is the OpenAI Agents SDK?

Understanding what an agent does and why we need to monitor it

The OpenAI Agents SDK helps you build AI agents that can think through problems and use tools. Unlike a simple chatbot that just responds, an agent:

  • Reads your question

  • Thinks about what information it needs

  • Uses tools to get that information

  • Thinks again about the answer

  • Finally gives you a final response

Because the agent does many things automatically, we need to watch what it does at every step to make sure nothing bad happens.

How Monitoring Works with Hooks

Hooks are checkpoints where the agent tells us what it is about to do

The SDK gives us special functions called hooks. Think of hooks like alarm bells that ring at important moments. When the bell rings, we can check if everything is safe before letting the agent continue.

The SDK has six hooks:

  • on_agent_start — fires when the agent initializes. What we check: nothing (we just log the agent name)

  • on_llm_start — fires before asking the AI model. What we check: the user question and what we send to the AI

  • on_llm_end — fires after the AI responds. What we check: what the AI decided to do

  • on_tool_start — fires before using a tool. What we check: is the tool call safe

  • on_tool_end — fires after the tool finishes. What we check: is the tool result safe

  • on_agent_end — fires before showing the user the answer. What we check: is the final answer safe
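Put together, the six hooks form one monitoring class. Below is a minimal sketch of its shape. In the real integration this class would subclass the SDK's hooks base class; it is shown here as a plain class so the checkpoints are visible without the SDK installed, and the `MonitorHooks` name and `state` dict are illustrative.

```python
import asyncio

# Minimal sketch: one class with the six hook checkpoints.
# In the real integration this would subclass the SDK's hooks base class.
class MonitorHooks:
    def __init__(self):
        self.state = {}  # event IDs saved between hooks

    async def on_agent_start(self, context, agent):
        self.state["agent"] = agent.name  # no API call yet

    async def on_llm_start(self, context, agent, system_prompt, input_items):
        ...  # send user question (first time) + LLM input to AIceberg

    async def on_llm_end(self, context, agent, response):
        ...  # send the model's decision, linked to the input event

    async def on_tool_start(self, context, agent, tool):
        ...  # send the tool call for a safety check

    async def on_tool_end(self, context, agent, tool, result):
        ...  # send the tool result, linked to the tool-call event

    async def on_agent_end(self, context, agent, output):
        ...  # send the final answer, linked to the user question
```

Each hook mutates the shared `state` dict, which is how the input and output events get linked later.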

What We Monitor at Each Hook

Details about what information we send to AIceberg at each checkpoint

1. on_agent_start — Agent Initializes

What we get from the SDK:

  • context — current execution context

  • agent — the agent object with name, tools, and instructions

What we do:

on_agent_start
async def on_agent_start(self, context, agent):
    # Just save the agent name; no API call needed yet
    self.state["agent"] = agent.name
    print(f"Agent started: {agent.name}")

Notes:

  • We do NOT send anything to AIceberg here. We do not have the user question yet. We just remember the agent name for later. The user question comes in the next hook.

2. on_llm_start — Before Asking the AI Model (this is where we get the user question)

What we get from the SDK:

  • context — current state

  • agent — the agent object with tools

  • system_prompt — instructions for the AI

  • input_items — conversation history (THIS HAS THE USER QUESTION)

What we do:

on_llm_start
async def on_llm_start(self, context, agent, system_prompt, input_items):
    # Extract user question from input_items
    user_text = ""
    for item in input_items:
        if isinstance(item, dict) and "content" in item:
            user_text += str(item["content"]) + "\n"
    user_text = user_text.strip()

    # First time only: send user question to AIceberg
    if "user_event_id" not in self.state:
        event_id = send_to_aiceberg(
            type="user_agt",
            input=user_text
        )
        self.state["user_event_id"] = event_id

    # Every time: build structured data from agent object
    llm_input = {
        "name": agent.name,
        "handoff_description": agent.handoff_description,
        "tools": [extract_tool_info(tool) for tool in agent.tools],
        "instructions": system_prompt,
        "input_items": input_items
    }

    event_id = send_to_aiceberg(
        type="agt_llm",
        input=llm_input
    )
    self.state["llm_event_id"] = event_id

Why we build it this way:

  • We extract user question from input_items because that is where the SDK puts it

  • We get agent metadata from the agent object because it has all the configuration

  • We extract tool schemas from agent.tools to show AIceberg what tools are available

  • We include full input_items for complete conversation history

We send TWO events here: user question (first time only, extracted from input_items) and LLM input (every time, built from agent object)

Why Two Separate Events?

  • user_agt event — Focuses on user to agent interaction (checks if user is asking something harmful)

  • agt_llm event — Focuses on agent to LLM interaction (checks what we send to the AI model)

This separation allows different policies and clearer blocking points.
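As a rough sketch of how these events might be assembled, here is a hypothetical `build_payload` helper behind the `send_to_aiceberg` calls used throughout this guide. The payload keys match the example payloads shown later in this document, but the `PROFILE_ENV` mapping and the function itself are illustrative, not the documented AIceberg client API.

```python
import os

# Hypothetical sketch of payload assembly for send_to_aiceberg.
# Maps each event type to the environment variable holding its profile ID.
PROFILE_ENV = {
    "user_agt": "AB_monitoring_profile_U2A",
    "agt_llm": "AB_monitoring_profile_A2M",
    "agt_tool": "AB_monitoring_profile_A2T",
}

def build_payload(type, input=None, output=None, link=None):
    """Assemble an event payload; input events omit event_id, output events link to one."""
    payload = {
        "profile_id": os.environ.get(PROFILE_ENV[type], ""),
        "event_type": type,
        "forward_to_llm": False,
    }
    if input is not None:
        payload["input"] = input
    if output is not None:
        payload["output"] = output
    if link is not None:
        payload["event_id"] = link  # links an output event to its input event
    return payload
```

Keeping `user_agt` and `agt_llm` as separate calls means each can carry its own profile ID and be blocked independently.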

3. on_llm_end — After the AI Model Responds

What we get from the SDK:

  • context — has usage stats

  • agent — agent object

  • response — what the AI decided

What we do:

on_llm_end
async def on_llm_end(self, context, agent, response):
    # Send what the AI responded with
    output = str(response.output)

    send_to_aiceberg(
        type="agt_llm",
        output=output,
        link=self.state["llm_event_id"]
    )

Notes:

  • We use response.output to get what the AI decided

  • We link it to the input event using the event_id we saved earlier

  • We do not send usage stats because AIceberg does not need them

4. on_tool_start — Before Using a Tool

What we get from the SDK:

  • context — ToolContext object with everything we need

  • agent — agent object

  • tool — the tool object

What we do:

on_tool_start
async def on_tool_start(self, context, agent, tool):
    # Extract directly from context
    event_id = send_to_aiceberg(
        type="agt_tool",
        input={
            "tool_name": context.tool_name,
            "tool_call_id": context.tool_call_id,
            "tool_arguments": context.tool_arguments
        }
    )
    self.state["tool_event_id"] = event_id

Why we use context here:

  • ToolContext already has tool_name, tool_call_id, and tool_arguments

  • No need to extract from the tool object

  • The framework already structured it perfectly for us

We do not send usage stats here because they are not needed for safety checks.

5. on_tool_end — After the Tool Finishes

What we get from the SDK:

  • context — ToolContext object

  • agent — agent object

  • tool — the tool object

  • result — what the tool returned

What we do:

on_tool_end
async def on_tool_end(self, context, agent, tool, result):
    # Extract from context and result
    send_to_aiceberg(
        type="agt_tool",
        output={
            "tool_name": context.tool_name,
            "tool_call_id": context.tool_call_id,
            "result": str(result)
        },
        link=self.state["tool_event_id"]
    )

Notes:

  • We still use context for tool_name and tool_call_id

  • We add the result parameter to show what the tool returned

  • We link it back to the tool input using the saved event_id

6. on_agent_end — Final Answer

What we get from the SDK:

  • context — final state

  • agent — agent object

  • output — the final answer

What we do:

on_agent_end
async def on_agent_end(self, context, agent, output):
    # Send final answer to user
    final = str(output)

    send_to_aiceberg(
        type="user_agt",
        output=final,
        link=self.state["user_event_id"]
    )

Notes:

  • We use the output parameter directly

  • We link back to the original user question using the saved event_id

  • This completes the full circle from question to answer

When We Use Context vs Agent Object

Understanding why we get data from different places at different times

We Use Agent Object When:

  • We need agent metadata like name and instructions

  • We need the list of available tools

  • We need tool schemas with parameters

Example:

extract tools from agent
def extract_tool_info(tool):
    # This is the extract_tool_info helper referenced in on_llm_start
    return {
        "name": tool.name,
        "description": tool.description,
        "params_json_schema": tool.params_json_schema
    }

tools_data = [extract_tool_info(tool) for tool in agent.tools]

We Use Context When:

  • We need structured data the framework already prepared

  • For tools: tool_name, tool_call_id, tool_arguments are ready

  • No extra work needed to extract the data

Example:

extract tool context
# ToolContext already has everything
tool_name = context.tool_name
tool_call_id = context.tool_call_id
tool_arguments = context.tool_arguments

Why We Filter Out Usage Stats:

The context object has usage information like token counts and costs. We do not send this to AIceberg because:

  • AIceberg checks for safety, not cost tracking

  • Usage stats cannot be harmful

  • Keeping payloads smaller makes everything faster
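A minimal sketch of what filtering usage data out of a payload could look like. The key names in `SKIP_KEYS` are illustrative; in practice the monitor simply never copies these fields into the payload in the first place.

```python
# Illustrative list of usage/billing fields we never forward to AIceberg.
SKIP_KEYS = {"usage", "input_tokens", "output_tokens", "total_tokens", "cost"}

def filter_payload(data: dict) -> dict:
    """Return a shallow copy of data without usage/billing keys."""
    return {k: v for k, v in data.items() if k not in SKIP_KEYS}
```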

How AIceberg Responds

What happens after we send an event to AIceberg

After we send each event, AIceberg sends back a response like:

{
  "event_id": "01K750H9DHKS5X26C2K9BXP5XY",
  "event_result": "passed",
  "status": "finished"
}

We check the event_result:

  • If "passed" — everything is safe, continue

  • If "blocked" — stop immediately, raise error

When something is blocked, we stop right there. The agent does not continue. The user gets an error message instead of a dangerous response.
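A small sketch of that check. The `AicebergBlockedError` exception name is our own choice for illustration, not part of the AIceberg API.

```python
class AicebergBlockedError(RuntimeError):
    """Raised when AIceberg blocks an event; stops the agent run."""

def check_response(response: dict) -> str:
    # "passed" lets the run continue; "blocked" halts it immediately.
    if response.get("event_result") == "blocked":
        raise AicebergBlockedError(
            f"AIceberg blocked event {response.get('event_id', '<unknown>')}"
        )
    return response.get("event_id", "")
```

Raising inside the hook is what stops the agent: the run never reaches the next step, and the caller sees the error instead of the blocked content.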

Example: Complete Run

Following one question through all 8 checkpoints

User asks: "What is 10 plus 5?"

Step 1 — on_agent_start

What We Send: Nothing (just log). Result: Continue

Step 2 — on_llm_start

What We Send: User question + agent metadata. Result: Passed

Step 3 — on_llm_end

What We Send: The AI wants to call the "add" tool. Result: Passed

Step 4 — on_tool_start

What We Send: add(10, 5). Result: Passed

Step 5 — on_tool_end

What We Send: Tool result 15. Result: Passed

Step 6 — on_llm_start (again)

What We Send: Agent metadata + updated conversation. Result: Passed

Step 7 — on_llm_end

What We Send: The AI's final answer text. Result: Passed

Step 8 — on_agent_end

What We Send: "10 plus 5 is 15". Result: Passed

All checks passed, so the user gets their answer safely.

How to Use the Monitor

Simple code to add monitoring to your agent

Basic usage:

basic usage
from aiceberg_monitor import AicebergMonitor
from agents import Runner, Agent

# Create monitor
monitor = AicebergMonitor()

# Create your agent
agent = Agent(
    name="MyAgent",
    instructions="You are helpful",
    tools=[...]
)

# Run with monitoring
result = await Runner.run(
    agent,
    "What is the weather?",
    **monitor.attach(agent)
)

The monitoring happens automatically. You do not need to change your agent code at all.

What About the Logging Version?

We have two versions of the monitor

There are two files:

  • aiceberg_monitor.py — Simple monitoring only

  • aiceberg_monitor_with_logging.py — Same monitoring + saves to JSON file

The logging version does the exact same monitoring. It just also saves everything to a file so you can debug and review what happened later. The logging is for your own debugging, not for AIceberg.

Using the logging version:

using logging version
from aiceberg_monitor_with_logging import AicebergMonitor

monitor = AicebergMonitor(save_to_file='monitor_log.json')
result = await Runner.run(agent, "question", **monitor.attach(agent))

This creates a file with all events and responses for you to look at later.
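Conceptually, the logging version's extra behavior could be sketched like this. The `EventLog` class and its method names are illustrative, not the actual implementation.

```python
import json

# Sketch of the debug log the logging version keeps: record each event
# (payload + AIceberg response), then dump everything to a JSON file.
class EventLog:
    def __init__(self, path: str):
        self.path = path
        self.events = []

    def record(self, payload: dict, response: dict) -> None:
        # Called once per event; keeps the pair together for later review
        self.events.append({"payload": payload, "response": response})

    def save(self) -> None:
        with open(self.path, "w") as f:
            json.dump(self.events, f, indent=2)
```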

Important Settings

Configuration that makes everything work

Environment Variables

The monitor needs these environment variables set:

AICEBERG_API_KEY=your_api_key
AB_monitoring_profile_U2A=profile_for_user_agent
AB_monitoring_profile_A2M=profile_for_agent_llm
AB_monitoring_profile_A2T=profile_for_agent_tool
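It can help to validate these at startup so a missing variable fails fast instead of producing empty profile IDs mid-run. A small sketch, where the `load_settings` helper is illustrative:

```python
import os

# The four variables this guide's monitor requires.
REQUIRED = [
    "AICEBERG_API_KEY",
    "AB_monitoring_profile_U2A",
    "AB_monitoring_profile_A2M",
    "AB_monitoring_profile_A2T",
]

def load_settings(env=os.environ) -> dict:
    """Fail fast if any required variable is unset or empty."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return {name: env[name] for name in REQUIRED}
```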

Event ID Linking

We save event IDs when we send input events. When we send the matching output event, we include that event ID to link them together. This helps AIceberg understand which input and output go together.
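In code, the pattern is just a small state dictionary. The function names here are illustrative; the hook code earlier in this guide stores the IDs directly in `self.state`.

```python
# Sketch of event ID linking: save the ID returned for each input event,
# then attach it to the matching output event.
state = {}

def save_input_event(kind: str, event_id: str) -> None:
    # e.g. kind = "llm" or "tool"; called right after sending an input event
    state[f"{kind}_event_id"] = event_id

def link_for_output(kind: str) -> str:
    # Returns the event_id to put in the output payload's "event_id" field
    return state[f"{kind}_event_id"]
```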

What We Do Not Send

Information we skip to keep payloads clean

We do not send:

  • Token usage and billing data

  • Model version and technical details

  • Internal execution IDs

  • Framework implementation details

  • Timing and performance data

We only send information that could be harmful or violate policies. Everything else is left out to keep monitoring focused and fast.

Summary

The main points about how monitoring works

  • We use hooks to check safety at 8 points for every user question

  • We send structured data to AIceberg using the simplest approach

  • We use agent object when we need metadata and tool schemas

  • We use context object when the framework already structured the data for us

  • We filter out usage stats and technical details

  • If AIceberg blocks something, we stop immediately

  • The monitoring is transparent — no changes to agent code needed

The whole system is designed to be simple and easy to understand. We do not do any complex processing. We just take data from the right places and send it to AIceberg in a clean format.

Example: Actual Payloads and Responses

Looking at the actual data sent to AIceberg from a real test run ("What is 10 plus 5?")

Event 1: User Question (Input)

Type: user_agt | Direction: input

What we sent:

{
"profile_id": "01XXXXXXXXXXXXXXXXXXXXXXX",
"event_type": "user_agt",
"forward_to_llm": false,
"input": "What is 10 plus 5?"
}

AIceberg responded:

{
"event_id": "01K7Q5A0D33RY8MJ1BBVS8NSD1",
"event_result": "passed",
"input_token_count": 5
}

Event 2: Agent to LLM (Input)

Type: agt_llm | Direction: input

What we sent (structured agent data):

{
"profile_id": "01XXXXXXXXXXXXXXXXXXXXXXX",
"event_type": "agt_llm",
"forward_to_llm": false,
"input": {
    "name": "MathAgent",
    "handoff_description": "None",
    "tools": [
      {
        "name": "add",
        "description": "Adds two numbers.",
        "params_json_schema": {
          "properties": {
            "a": {"title": "A", "type": "integer"},
            "b": {"title": "B", "type": "integer"}
          },
          "required": ["a", "b"],
          "title": "add_args",
          "type": "object"
        },
        "strict_json_schema": "True",
        "is_enabled": "True"
      },
      {
        "name": "multiply",
        "description": "Multiplies two numbers.",
        "params_json_schema": {
          "properties": {
            "a": {"title": "A", "type": "integer"},
            "b": {"title": "B", "type": "integer"}
          },
          "required": ["a", "b"],
          "title": "multiply_args",
          "type": "object"
        },
        "strict_json_schema": "True",
        "is_enabled": "True"
      }
    ],
    "instructions": "You are a helpful math assistant.",
    "input_items": [
      {"content": "What is 10 plus 5?", "role": "user"}
    ]
}
}

AIceberg responded:

{
"event_id": "01K7Q5ABGB2GN22ZM13G9ZER01",
"event_result": "passed",
"input_token_count": 95
}

Event 3: LLM Response (Output)

Type: agt_llm | Direction: output

What we sent:

{
"profile_id": "01XXXXXXXXXXXXXXXXXXXXXXX",
"event_type": "agt_llm",
"forward_to_llm": false,
"output": "[ResponseFunctionToolCall(arguments='{\"a\":10,\"b\":5}', call_id='call_ZVZpvrPicP0S1lRHEIvzjXS7', name='add', type='function_call')]",
"event_id": "01K7Q5ABGB2GN22ZM13G9ZER01"
}

AIceberg responded:

{
"event_result": "passed"
}

(Notice we linked this output to the input using event_id)

Event 4: Tool Call (Input)

Type: agt_tool | Direction: input

What we sent:

{
"profile_id": "01XXXXXXXXXXXXXXXXXXXXXXX",
"event_type": "agt_tool",
"forward_to_llm": false,
"input": {
    "tool_name": "add",
    "tool_call_id": "call_ZVZpvrPicP0S1lRHEIvzjXS7",
    "tool_arguments": "{\"a\":10,\"b\":5}"
}
}

AIceberg responded:

{
"event_id": "01K7Q5AQYWFR50NGZY3B7495F8",
"event_result": "passed",
"input_token_count": 6
}

Event 5: Tool Result (Output)

Type: agt_tool | Direction: output

What we sent:

{
"profile_id": "01XXXXXXXXXXXXXXXXXXXXXXX",
"event_type": "agt_tool",
"forward_to_llm": false,
"output": {
    "tool_name": "add",
    "tool_call_id": "call_ZVZpvrPicP0S1lRHEIvzjXS7",
    "result": "15"
},
"event_id": "01K7Q5AQYWFR50NGZY3B7495F8"
}

AIceberg responded:

{
"event_result": "passed"
}

Event 6: Agent to LLM Again (Input)

Type: agt_llm | Direction: input

What we sent (now with tool call history):

{
"profile_id": "01XXXXXXXXXXXXXXXXXXXXXXX",
"event_type": "agt_llm",
"forward_to_llm": false,
"input": {
    "name": "MathAgent",
    "handoff_description": "None",
    "tools": [...same tools...],
    "instructions": "You are a helpful math assistant.",
    "input_items": [
      {"content": "What is 10 plus 5?", "role": "user"},
      {
        "arguments": "{\"a\":10,\"b\":5}",
        "call_id": "call_ZVZpvrPicP0S1lRHEIvzjXS7",
        "name": "add",
        "type": "function_call",
        "status": "completed"
      },
      {
        "call_id": "call_ZVZpvrPicP0S1lRHEIvzjXS7",
        "output": "15",
        "type": "function_call_output"
      }
    ]
}
}

AIceberg responded:

{
"event_id": "01K7Q5B36RSYV6FEJZ3BQN88YE",
"event_result": "passed",
"input_token_count": 113
}

Event 7: LLM Final Response (Output)

Type: agt_llm | Direction: output

What we sent:

{
"profile_id": "01XXXXXXXXXXXXXXXXXXXXXXX",
"event_type": "agt_llm",
"forward_to_llm": false,
"output": "[ResponseOutputMessage(content=[ResponseOutputText(text='10 plus 5 is 15.', type='output_text')])]",
"event_id": "01K7Q5B36RSYV6FEJZ3BQN88YE"
}

AIceberg responded:

{
"event_result": "passed"
}

Event 8: Final Answer to User (Output)

Type: user_agt | Direction: output

What we sent:

{
"profile_id": "01XXXXXXXXXXXXXXXXXXXXXXX",
"event_type": "user_agt",
"forward_to_llm": false,
"output": "10 plus 5 is 15.",
"event_id": "01K7Q5A0D33RY8MJ1BBVS8NSD1"
}

AIceberg responded:

{
"event_result": "passed"
}
