OpenAI Agents SDK

This guide explains how we built simple monitoring for AI agents using the OpenAI Agents SDK. We send every important action to AIceberg for safety checks before continuing.

What is the OpenAI Agents SDK?

Understanding what an agent does and why we need to monitor it

The OpenAI Agents SDK helps you build AI agents that can think through problems and use tools. Unlike a simple chatbot that just responds, an agent:

  • Reads your question

  • Thinks about what information it needs

  • Uses tools to get that information

  • Thinks again about the answer

  • Gives you a final response

Because the agent does many things automatically, we need to watch what it does at every step to make sure nothing bad happens.
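
For context, here is a minimal sketch of the kind of agent this guide monitors, built with the SDK's Agent, Runner, and function_tool. The agent name and instructions are illustrative; the add tool mirrors the worked example later in this guide.

```python
from agents import Agent, Runner, function_tool

@function_tool
def add(a: int, b: int) -> int:
    """Add two numbers."""
    return a + b

agent = Agent(
    name="Math Agent",  # name is illustrative
    instructions="Answer math questions. Use the add tool for arithmetic.",
    tools=[add],
)

# The agent reads the question, decides to call add(10, 5), then writes the answer.
result = Runner.run_sync(agent, "What is 10 plus 5?")
print(result.final_output)
```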

How Monitoring Works with Hooks

Hooks are checkpoints where the agent tells us what it is about to do

The SDK gives us special functions called hooks. Think of hooks like alarm bells that ring at important moments. When the bell rings, we can check if everything is safe before letting the agent continue.

The SDK has six hooks:

  • on_agent_start — fires when the agent initializes; we send nothing to AIceberg, we just log the agent name

  • on_llm_start — fires before asking the AI model; we check the user question plus what we are about to send to the AI

  • on_llm_end — fires after the AI responds; we check what the AI decided to do

  • on_tool_start — fires before using a tool; we check whether the tool call is safe

  • on_tool_end — fires after the tool finishes; we check whether the tool result is safe

  • on_agent_end — fires before showing the user the answer; we check whether the final answer is safe
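
Putting the six hooks together, the monitor is a subclass of the SDK's RunHooks with one method per checkpoint. The sketch below is our skeleton; the class name and the saved attributes are our own choices, and each method is filled in section by section below.

```python
from agents import RunHooks

class AicebergMonitor(RunHooks):
    """Sends every checkpoint to AIceberg before the agent continues."""

    def __init__(self):
        self.agent_name = None
        self.user_event_id = None   # links the user question to the final answer
        self.llm_event_id = None    # links an LLM input to its output
        self.tool_event_id = None   # links a tool call to its result

    async def on_agent_start(self, context, agent): ...
    async def on_llm_start(self, context, agent, system_prompt, input_items): ...
    async def on_llm_end(self, context, agent, response): ...
    async def on_tool_start(self, context, agent, tool): ...
    async def on_tool_end(self, context, agent, tool, result): ...
    async def on_agent_end(self, context, agent, output): ...
```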

What We Monitor at Each Hook

Details about what information we send to AIceberg at each checkpoint

1. on_agent_start — Agent Initializes

What we get from the SDK:

  • context — current execution context

  • agent — the agent object with name, tools, and instructions

What we do:
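
A minimal sketch of this hook, continuing the AicebergMonitor skeleton above (the saved attribute name is our own):

```python
async def on_agent_start(self, context, agent):
    # Nothing goes to AIceberg yet; just remember the agent name for later hooks.
    self.agent_name = agent.name
```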

Notes:

  • We do NOT send anything to AIceberg here. We do not have the user question yet. We just remember the agent name for later. The user question comes in the next hook.

2. on_llm_start — Before Asking the AI Model (THIS IS WHERE WE GET THE USER QUESTION)

What we get from the SDK:

  • context — current state

  • agent — the agent object with tools

  • system_prompt — instructions for the AI

  • input_items — conversation history (THIS HAS THE USER QUESTION)

What we do:
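
A sketch of this hook, continuing the skeleton above. The _send_event helper is our own stand-in for whatever posts the event to AIceberg and returns its event id; the exact shape of the items in input_items is an assumption (user messages as dicts with a role and content).

```python
async def on_llm_start(self, context, agent, system_prompt, input_items):
    # Pull the latest user message out of the conversation history.
    user_question = next(
        (item.get("content") for item in reversed(list(input_items))
         if isinstance(item, dict) and item.get("role") == "user"),
        None,
    )

    # First time only: report the user -> agent interaction.
    if user_question and self.user_event_id is None:
        self.user_event_id = self._send_event(
            event_type="user_agt", direction="input",
            payload={"user_question": user_question},
        )

    # Every time: report the agent -> LLM input, built from the agent object.
    tools = [{"name": t.name, "description": t.description} for t in agent.tools]
    self.llm_event_id = self._send_event(
        event_type="agt_llm", direction="input",
        payload={
            "agent": agent.name,
            "system_prompt": system_prompt,
            "tools": tools,
            "input_items": list(input_items),
        },
    )
```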

Why we build it this way:

  • We extract user question from input_items because that is where the SDK puts it

  • We get agent metadata from the agent object because it has all the configuration

  • We extract tool schemas from agent.tools to show AIceberg what tools are available

  • We include full input_items for complete conversation history

We send TWO events here: user question (first time only, extracted from input_items) and LLM input (every time, built from agent object)

Why Two Separate Events?

  • user_agt event — Focuses on user to agent interaction (checks if user is asking something harmful)

  • agt_llm event — Focuses on agent to LLM interaction (checks what we send to the AI model)

This separation allows different policies and clearer blocking points.

3. on_llm_end — After AI Model Responds

What we get from the SDK:

  • context — has usage stats

  • agent — agent object

  • response — what the AI decided

What we do:
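
A sketch of this hook, continuing the skeleton above (_send_event and its linked_event_id parameter are our own names):

```python
async def on_llm_end(self, context, agent, response):
    # Report what the model decided, linked to the input event saved in on_llm_start.
    self._send_event(
        event_type="agt_llm", direction="output",
        payload={"llm_output": str(response.output)},
        linked_event_id=self.llm_event_id,
    )
```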

Notes:

  • We use response.output to get what the AI decided

  • We link it to the input event using the event_id we saved earlier

  • We do not send usage stats because AIceberg does not need them

4. on_tool_start — Before Using a Tool

What we get from the SDK:

  • context — ToolContext object with everything we need

  • agent — agent object

  • tool — the tool object

What we do:
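
A sketch of this hook, continuing the skeleton above; the tool fields come straight from the ToolContext:

```python
async def on_tool_start(self, context, agent, tool):
    # The ToolContext already carries the structured call data, so no extraction is needed.
    self.tool_event_id = self._send_event(
        event_type="agt_tool", direction="input",
        payload={
            "tool_name": context.tool_name,
            "tool_call_id": context.tool_call_id,
            "tool_arguments": context.tool_arguments,
        },
    )
```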

Why we use context here:

  • ToolContext already has tool_name, tool_call_id, and tool_arguments

  • No need to extract from the tool object

  • The framework already structured it perfectly for us

We do not send usage stats here because they are not needed for safety checks.

5. on_tool_end — After Tool Finishes

What we get from the SDK:

  • context — ToolContext object

  • agent — agent object

  • tool — the tool object

  • result — what the tool returned

What we do:
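
A sketch of this hook, continuing the skeleton above:

```python
async def on_tool_end(self, context, agent, tool, result):
    # Report what the tool returned, linked back to the tool-call input event.
    self._send_event(
        event_type="agt_tool", direction="output",
        payload={
            "tool_name": context.tool_name,
            "tool_call_id": context.tool_call_id,
            "result": str(result),
        },
        linked_event_id=self.tool_event_id,
    )
```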

Notes:

  • We still use context for tool_name and tool_call_id

  • We add the result parameter to show what the tool returned

  • We link it back to the tool input using the saved event_id

6. on_agent_end — Final Answer

What we get from the SDK:

  • context — final state

  • agent — agent object

  • output — the final answer

What we do:
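
A sketch of this hook, continuing the skeleton above:

```python
async def on_agent_end(self, context, agent, output):
    # Report the final answer, linked back to the original user question event.
    self._send_event(
        event_type="user_agt", direction="output",
        payload={"final_answer": str(output)},
        linked_event_id=self.user_event_id,
    )
```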

Notes:

  • We use the output parameter directly

  • We link back to the original user question using the saved event_id

  • This completes the full circle from question to answer

When We Use Context vs Agent Object

Understanding why we get data from different places at different times

We Use Agent Object When:

  • We need agent metadata like name and instructions

  • We need the list of available tools

  • We need tool schemas with parameters

Example:
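
For example, something like the snippet below. Whether each tool exposes a params_json_schema attribute is an assumption about the SDK's FunctionTool, so we read it defensively.

```python
agent_metadata = {
    "name": agent.name,
    "instructions": agent.instructions,
    "tools": [
        {
            "name": t.name,
            "description": t.description,
            "parameters": getattr(t, "params_json_schema", None),  # assumption
        }
        for t in agent.tools
    ],
}
```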

We Use Context When:

  • We need structured data the framework already prepared

  • For tools: tool_name, tool_call_id, tool_arguments are ready

  • No extra work needed to extract the data

Example:
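
For example, using the fields the ToolContext already provides:

```python
tool_call = {
    "tool_name": context.tool_name,
    "tool_call_id": context.tool_call_id,
    "tool_arguments": context.tool_arguments,
}
```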

Why We Filter Out Usage Stats:

The context object has usage information like token counts and costs. We do not send this to AIceberg because:

  • AIceberg checks for safety, not cost tracking

  • Usage stats cannot be harmful

  • Keeping payloads smaller makes everything faster

How AIceberg Responds

What happens after we send an event to AIceberg

After we send each event, AIceberg sends back a response like:
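
The exact response format depends on your AIceberg setup; a minimal illustration with the two fields we rely on (the event id we save for linking and the event_result) might look like:

```python
response = {
    "event_id": "evt_001",     # saved so we can link the matching output event
    "event_result": "passed",  # or "blocked"
}
```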

We check the event_result:

  • If "passed" — everything is safe, continue

  • If "blocked" — stop immediately, raise error

When something is blocked, we stop right there. The agent does not continue. The user gets an error message instead of a dangerous response.
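
In code, this is a simple guard inside each hook; a sketch (the exception type and message are our own choices):

```python
if response["event_result"] == "blocked":
    # Raise before the agent is allowed to continue; the user sees an error instead.
    raise RuntimeError("Blocked by AIceberg safety policy")
```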

Example: Complete Run

Following one question through all 8 checkpoints (the two LLM hooks fire twice because the agent calls the model once to choose the tool and once to write the final answer)

User asks: "What is 10 plus 5?"

Step 1 — on_agent_start

What We Send: Nothing (just log)
Result: Continue

Step 2 — on_llm_start

What We Send: User question + agent metadata
Result: Passed

Step 3 — on_llm_end

What We Send: AI wants to call the "add" tool
Result: Passed

Step 4 — on_tool_start

What We Send: add(10, 5)
Result: Passed

Step 5 — on_tool_end

What We Send: Tool result (15)
Result: Passed

Step 6 — on_llm_start (again)

What We Send: Agent metadata + updated conversation
Result: Passed

Step 7 — on_llm_end

What We Send: AI final answer text
Result: Passed

Step 8 — on_agent_end

What We Send: "10 plus 5 is 15"
Result: Passed

All checks passed, so the user gets their answer safely.

How to Use the Monitor

Simple code to add monitoring to your agent

Basic usage:
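
A sketch of wiring the monitor into a run, assuming the monitor class is exported from aiceberg_monitor under the name used in this guide and that you pass it through the Runner's hooks parameter:

```python
from agents import Agent, Runner
from aiceberg_monitor import AicebergMonitor  # class name assumed

agent = Agent(name="Math Agent", instructions="Answer math questions.")

# Every hook fires automatically during the run; the agent itself is unchanged.
result = Runner.run_sync(agent, "What is 10 plus 5?", hooks=AicebergMonitor())
print(result.final_output)
```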

The monitoring happens automatically. You do not need to change your agent code at all.

What About the Logging Version?

We have two versions of the monitor

There are two files:

  • aiceberg_monitor.py — Simple monitoring only

  • aiceberg_monitor_with_logging.py — Same monitoring + saves to JSON file

The logging version does the exact same monitoring. It just also saves everything to a file so you can debug and review what happened later. The logging is for your own debugging, not for AIceberg.

Using the logging version:
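
Same idea, just importing from the logging variant instead (class name assumed; the agent is defined as in the previous example):

```python
from agents import Runner
from aiceberg_monitor_with_logging import AicebergMonitor  # class name assumed

# Same run as before; events and responses are also written to a JSON file.
result = Runner.run_sync(agent, "What is 10 plus 5?", hooks=AicebergMonitor())
```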

This creates a file with all events and responses for you to look at later.

Important Settings

Configuration that makes everything work

Environment Variables

The monitor needs these environment variables set:
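
For illustration only; the actual variable names depend on your monitor and AIceberg account, so treat the AIceberg names below as hypothetical placeholders:

```python
import os

# Hypothetical variable names; check your monitor's source for the real ones.
aiceberg_url = os.environ.get("AICEBERG_API_URL")
aiceberg_key = os.environ.get("AICEBERG_API_KEY")
openai_key = os.environ.get("OPENAI_API_KEY")  # needed by the Agents SDK itself
```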

Event ID Linking

We save event IDs when we send input events. When we send the matching output event, we include that event ID to link them together. This helps AIceberg understand which input and output go together.

What We Do Not Send

Information we skip to keep payloads clean

We do not send:

  • Token usage and billing data

  • Model version and technical details

  • Internal execution IDs

  • Framework implementation details

  • Timing and performance data

We only send information that could be harmful or violate policies. Everything else is left out to keep monitoring focused and fast.

Summary

The main points about how monitoring works

  • We use hooks to check safety at 8 points for every user question

  • We send structured data to AIceberg using the simplest approach

  • We use agent object when we need metadata and tool schemas

  • We use context object when the framework already structured the data for us

  • We filter out usage stats and technical details

  • If AIceberg blocks something, we stop immediately

  • The monitoring is transparent — no changes to agent code needed

The whole system is designed to be simple and easy to understand. We do not do any complex processing. We just take data from the right places and send it to AIceberg in a clean format.

Example: Actual Payloads and Responses

Looking at the actual data sent to AIceberg from a real test run ("What is 10 plus 5?")

Event 1: User Question (Input)

Type: user_agt | Direction: input

What we sent:

AIceberg responded:

Event 2: Agent to LLM (Input)

Type: agt_llm | Direction: input

What we sent (structured agent data):

AIceberg responded:

Event 3: LLM Response (Output)

Type: agt_llm | Direction: output

What we sent:

AIceberg responded:

(Notice we linked this output to the input using event_id)

Event 4: Tool Call (Input)

Type: agt_tool | Direction: input

What we sent:

AIceberg responded:

Event 5: Tool Result (Output)

Type: agt_tool | Direction: output

What we sent:

AIceberg responded:

Event 6: Agent to LLM Again (Input)

Type: agt_llm | Direction: input

What we sent (now with tool call history):

AIceberg responded:

Event 7: LLM Final Response (Output)

Type: agt_llm | Direction: output

What we sent:

AIceberg responded:

Event 8: Final Answer to User (Output)

Type: user_agt | Direction: output

What we sent:

AIceberg responded:
