OpenAI Agents SDK
This guide explains how we built simple monitoring for AI agents using the OpenAI Agents SDK. We send every important action to AIceberg for safety checks before continuing.
What is the OpenAI Agents SDK
Understanding what an agent does and why we need to monitor it
The OpenAI Agents SDK helps you build AI agents that can think through problems and use tools. Unlike a simple chatbot that just responds, an agent:
Reads your question
Thinks about what information it needs
Uses tools to get that information
Thinks again about the answer
Gives you a final response
Because the agent does many things automatically, we need to watch what it does at every step to make sure nothing bad happens.
How Monitoring Works with Hooks
Hooks are checkpoints where the agent tells us what it is about to do
The SDK gives us special functions called hooks. Think of hooks like alarm bells that ring at important moments. When the bell rings, we can check if everything is safe before letting the agent continue.
The SDK has six hooks:
on_agent_start — When the agent starts (initialization): nothing is sent, we just log the agent name
on_llm_start — Before asking the AI model: the user question and what we send to the AI
on_llm_end — After the AI responds: what the AI decided to do
on_tool_start — Before using a tool: whether the tool call is safe
on_tool_end — After the tool finishes: whether the tool result is safe
on_agent_end — Before showing the user the answer: whether the final answer is safe
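In code, the monitor is just a class that subclasses the SDK's RunHooks base class and overrides these six methods. A minimal skeleton of that shape, with the bodies elided (the signatures match the sections below; the two LLM hooks require a recent SDK version):
from agents import RunHooks

class AicebergMonitor(RunHooks):
    def __init__(self):
        self.state = {}  # event IDs and the agent name, shared across hooks

    async def on_agent_start(self, context, agent): ...  # remember the agent name
    async def on_llm_start(self, context, agent, system_prompt, input_items): ...  # check user question + LLM input
    async def on_llm_end(self, context, agent, response): ...  # check the model's response
    async def on_tool_start(self, context, agent, tool): ...  # check the tool call
    async def on_tool_end(self, context, agent, tool, result): ...  # check the tool result
    async def on_agent_end(self, context, agent, output): ...  # check the final answer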
What We Monitor at Each Hook
Details about what information we send to AIceberg at each checkpoint
on_agent_start — Agent Initializes
What we get from the SDK:
context — current execution context
agent — the agent object with name, tools, and instructions
What we do:
async def on_agent_start(self, context, agent):
    # Just save agent name, no API call needed
    self.state["agent"] = agent.name
    print("Agent started")
Notes:
We do NOT send anything to AIceberg here. We do not have the user question yet. We just remember the agent name for later. The user question comes in the next hook.
on_llm_start — Before Asking AI Model (THIS IS WHERE WE GET USER QUESTION)
What we get from the SDK:
context — current state
agent — the agent object with tools
system_prompt — instructions for the AI
input_items — conversation history (THIS HAS THE USER QUESTION)
What we do:
async def on_llm_start(self, context, agent, system_prompt, input_items):
    # Extract user question from input_items
    user_text = ""
    for item in input_items:
        if isinstance(item, dict) and "content" in item:
            user_text += str(item["content"]) + "\n"
    user_text = user_text.strip()

    # First time only: send user question to AIceberg
    if "user_event_id" not in self.state:
        event_id = send_to_aiceberg(
            type="user_agt",
            input=user_text
        )
        self.state["user_event_id"] = event_id

    # Every time: build structured data from agent object
    llm_input = {
        "name": agent.name,
        "handoff_description": agent.handoff_description,
        "tools": [extract_tool_info(tool) for tool in agent.tools],
        "instructions": system_prompt,
        "input_items": input_items
    }
    event_id = send_to_aiceberg(
        type="agt_llm",
        input=llm_input
    )
    self.state["llm_event_id"] = event_id
Why we build it this way:
We extract the user question from input_items because that is where the SDK puts it
We get agent metadata from the agent object because it has all the configuration
We extract tool schemas from agent.tools to show AIceberg what tools are available
We include the full input_items for complete conversation history
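The extract_tool_info helper used in the code above is our own, not an SDK function. A small sketch of what it can look like, assuming each tool exposes name, description, and params_json_schema (the same fields shown in the example further down):
def extract_tool_info(tool):
    # Keep only the fields AIceberg needs to understand the tool
    return {
        "name": getattr(tool, "name", None),
        "description": getattr(tool, "description", None),
        "params_json_schema": getattr(tool, "params_json_schema", None),
    }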
We send TWO events here: user question (first time only, extracted from input_items) and LLM input (every time, built from agent object)
Why Two Separate Events?
user_agt event — Focuses on user to agent interaction (checks if user is asking something harmful)
agt_llm event — Focuses on agent to LLM interaction (checks what we send to the AI model)
This separation allows different policies and clearer blocking points.
on_llm_end — After AI Model Responds
What we get from the SDK:
context — has usage stats
agent — agent object
response — what the AI decided
What we do:
async def on_llm_end(self, context, agent, response):
    # Send what the AI responded with
    output = str(response.output)
    send_to_aiceberg(
        type="agt_llm",
        output=output,
        link=self.state["llm_event_id"]
    )
Notes:
We use response.output to get what the AI decided
We link it to the input event using the event_id we saved earlier
We do not send usage stats because AIceberg does not need them
on_tool_start — Before Using a Tool
What we get from the SDK:
context — ToolContext object with everything we need
agent — agent object
tool — the tool object
What we do:
async def on_tool_start(self, context, agent, tool):
    # Extract directly from context
    event_id = send_to_aiceberg(
        type="agt_tool",
        input={
            "tool_name": context.tool_name,
            "tool_call_id": context.tool_call_id,
            "tool_arguments": context.tool_arguments
        }
    )
    self.state["tool_event_id"] = event_id
Why we use context here:
ToolContext already has tool_name, tool_call_id, and tool_arguments
No need to extract from the tool object
The framework already structured it perfectly for us
We do not send usage stats here because they are not needed for safety checks.
on_tool_end — After Tool Finishes
What we get from the SDK:
context — ToolContext object
agent — agent object
tool — the tool object
result — what the tool returned
What we do:
async def on_tool_end(self, context, agent, tool, result):
    # Extract from context and result
    send_to_aiceberg(
        type="agt_tool",
        output={
            "tool_name": context.tool_name,
            "tool_call_id": context.tool_call_id,
            "result": str(result)
        },
        link=self.state["tool_event_id"]
    )
Notes:
We still use context for tool_name and tool_call_id
We add the result parameter to show what the tool returned
We link it back to the tool input using the saved event_id
on_agent_end — Final Answer
What we get from the SDK:
context — final state
agent — agent object
output — the final answer
What we do:
async def on_agent_end(self, context, agent, output):
    # Send final answer to user
    final = str(output)
    send_to_aiceberg(
        type="user_agt",
        output=final,
        link=self.state["user_event_id"]
    )
Notes:
We use the output parameter directly
We link back to the original user question using the saved event_id
This completes the full circle from question to answer
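To make the linking concrete, here is roughly what self.state holds by the time on_agent_end runs for a run with one tool call (the IDs are illustrative placeholders):
# Rough shape of self.state at the end of a run with one tool call
# self.state = {
#     "agent": "MyAgent",            # saved in on_agent_start
#     "user_event_id": "01K75...",   # saved in on_llm_start, linked here in on_agent_end
#     "llm_event_id": "01K75...",    # saved in on_llm_start, linked in on_llm_end
#     "tool_event_id": "01K75...",   # saved in on_tool_start, linked in on_tool_end
# }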
When We Use Context vs Agent Object
Understanding why we get data from different places at different times
We Use Agent Object When:
We need agent metadata like name and instructions
We need the list of available tools
We need tool schemas with parameters
Example:
tools_data = []
for tool in agent.tools:
    tools_data.append({
        "name": tool.name,
        "description": tool.description,
        "params_json_schema": tool.params_json_schema
    })
We Use Context When:
We need structured data the framework already prepared
For tools: tool_name, tool_call_id, and tool_arguments are ready
No extra work is needed to extract the data
Example:
# ToolContext already has everything
tool_name = context.tool_name
tool_call_id = context.tool_call_id
tool_arguments = context.tool_arguments
Why We Filter Out Usage Stats:
The context object has usage information like token counts and costs. We do not send this to AIceberg because:
AIceberg checks for safety, not cost tracking
Usage stats cannot be harmful
Keeping payloads smaller makes everything faster
How AIceberg Responds
What happens after we send an event to AIceberg
After we send each event, AIceberg sends back a response like:
{
    "event_id": "01K750H9DHKS5X26C2K9BXP5XY",
    "event_result": "passed",
    "status": "finished"
}
We check the event_result:
If "passed" — everything is safe, continue
If "blocked" — stop immediately, raise error
When something is blocked, we stop right there. The agent does not continue. The user gets an error message instead of a dangerous response.
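The send_to_aiceberg helper used throughout this guide is our own wrapper, not part of the SDK. The endpoint URL, auth header, and payload shape below are assumptions for illustration; check your AIceberg documentation for the real API. The important part is the pattern: post the event, read event_result, raise if it is blocked, and return the event_id so later events can link to it.
import os
import requests

# Placeholder endpoint; use the URL from your AIceberg account
AICEBERG_URL = "https://api.aiceberg.example/v1/events"

def send_to_aiceberg(type, input=None, output=None, link=None):
    payload = {"type": type}
    if input is not None:
        payload["input"] = input
    if output is not None:
        payload["output"] = output
    if link is not None:
        payload["link"] = link  # ties an output event back to its input event
    # The real helper also selects the right monitoring profile
    # (U2A, A2M, or A2T) based on the event type.

    resp = requests.post(
        AICEBERG_URL,
        json=payload,
        # Auth header shape is an assumption; adjust to your AIceberg setup
        headers={"Authorization": f"Bearer {os.environ['AICEBERG_API_KEY']}"},
        timeout=10,
    )
    data = resp.json()

    if data.get("event_result") == "blocked":
        # Stop the agent immediately instead of letting it continue
        raise RuntimeError(f"AIceberg blocked this event: {data}")
    return data.get("event_id")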
Example: Complete Run
Following one question through all 8 checkpoints
User asks: "What is 10 plus 5?"
Assuming the agent makes one tool call to compute the sum, the eight checkpoints in this run are:
1. on_agent_start — the agent initializes (nothing sent to AIceberg)
2. on_llm_start — the user question and the LLM input are checked
3. on_llm_end — the model's decision to call a tool is checked
4. on_tool_start — the tool call and its arguments are checked
5. on_tool_end — the tool result is checked
6. on_llm_start — the second LLM input, now including the tool result, is checked
7. on_llm_end — the model's final response is checked
8. on_agent_end — the final answer shown to the user is checked
All checks passed, so the user gets their answer safely.
How to Use the Monitor
Simple code to add monitoring to your agent
Basic usage:
from aiceberg_monitor import AicebergMonitor
from agents import Runner, Agent
# Create monitor
monitor = AicebergMonitor()
# Create your agent
agent = Agent(
    name="MyAgent",
    instructions="You are helpful",
    tools=[...]
)
# Run with monitoring
result = await Runner.run(
    agent,
    "What is the weather?",
    **monitor.attach(agent)
)
The monitoring happens automatically. You do not need to change your agent code at all.
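The attach method belongs to our monitor, not the SDK. Since Runner.run accepts a hooks keyword argument, attach can be as small as returning the keyword arguments to splat into the call; a sketch assuming that design:
class AicebergMonitor(RunHooks):
    ...

    def attach(self, agent):
        # Runner.run(..., **self.attach(agent)) becomes Runner.run(..., hooks=self)
        return {"hooks": self}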
What About the Logging Version
We have two versions of the monitor
There are two files:
aiceberg_monitor.py — Simple monitoring only
aiceberg_monitor_with_logging.py — Same monitoring + saves to JSON file
The logging version does the exact same monitoring. It just also saves everything to a file so you can debug and review what happened later. The logging is for your own debugging, not for AIceberg.
Using the logging version:
from aiceberg_monitor_with_logging import AicebergMonitor
monitor = AicebergMonitor(save_to_file='monitor_log.json')
result = await Runner.run(agent, "question", **monitor.attach(agent))
This creates a file with all events and responses for you to look at later.
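Internally, the logging version only adds a little bookkeeping around the same hooks. A hedged sketch of the idea (the real file may structure its records differently):
import json

class AicebergMonitor(RunHooks):  # logging variant
    def __init__(self, save_to_file=None):
        self.state = {}
        self.save_to_file = save_to_file
        self.events = []  # every payload we sent and what AIceberg answered

    def _record(self, hook, payload, response):
        # Keep a local copy of each event so you can review the run later
        self.events.append({"hook": hook, "payload": payload, "response": response})
        if self.save_to_file:
            with open(self.save_to_file, "w") as f:
                json.dump(self.events, f, indent=2)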
Important Settings
Configuration that makes everything work
Environment Variables
The monitor needs these environment variables set:
AICEBERG_API_KEY=your_api_key
AB_monitoring_profile_U2A=profile_for_user_agent
AB_monitoring_profile_A2M=profile_for_agent_llm
AB_monitoring_profile_A2T=profile_for_agent_tool
Event ID Linking
We save event IDs when we send input events. When we send the matching output event, we include that event ID to link them together. This helps AIceberg understand which input and output go together.
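Condensed to its core, the linking pattern for one tool call looks like this (tool_input and tool_output stand in for the dictionaries built in the hook sections above):
# In on_tool_start: send the input event and remember its ID
event_id = send_to_aiceberg(type="agt_tool", input=tool_input)
self.state["tool_event_id"] = event_id

# In on_tool_end: send the output event and link it back to the input
send_to_aiceberg(type="agt_tool", output=tool_output, link=self.state["tool_event_id"])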
What We Do Not Send
Information we skip to keep payloads clean
We do not send:
Token usage and billing data
Model version and technical details
Internal execution IDs
Framework implementation details
Timing and performance data
We only send information that could be harmful or violate policies. Everything else is left out to keep monitoring focused and fast.
Summary
The main points about how monitoring works
We use hooks to check safety at every step of a run (eight checkpoints in the example above)
We send structured data to AIceberg using the simplest approach
We use agent object when we need metadata and tool schemas
We use context object when the framework already structured the data for us
We filter out usage stats and technical details
If AIceberg blocks something, we stop immediately
The monitoring is transparent — no changes to agent code needed
The whole system is designed to be simple and easy to understand. We do not do any complex processing. We just take data from the right places and send it to AIceberg in a clean format.
Example: Actual Payloads and Responses
Looking at the actual data sent to AIceberg from a real test run ("What is 10 plus 5?")