Amazon Strands Agents
Overview
StrandsAicebergHandler monitors your Strands agent conversations for safety by listening to hook events and sending them to Aiceberg.
Drop it into your agent with one line: hooks=[StrandsAicebergHandler()] and get real-time safety monitoring for user queries, LLM calls, and tool execution.
Why hooks and not callbacks?
Strands gives you two options for watching what your agent is doing:
Callback Handlers are like listeners that respond immediately to everything happening during your agent's execution. They fire constantly as things happen: when the model is thinking, when a tool runs, when output streams to the user. They're lightweight and let you see partial results in real-time.
Hooks are more structured. Instead of listening to everything, they fire at specific lifecycle moments that revolve around agent interactions, like right before calling the LLM or right after a tool finishes. They give you organized events at key checkpoints, and more importantly, they can interrupt the agent if something's wrong.
Quick comparison
| | Callback Handlers | Hooks |
|---|---|---|
| What they do | Listen to everything happening in real-time as it happens | Listen to major events before and after they happen |
| Best for | Streaming output to your UI, logging and debugging, watching things as they happen | Safety checks and guardrails, blocking bad content, enforcing rules |
| Can they stop the agent? | No, just watch and log | Yes, can stop execution immediately |
| Structure | Lots of small, unstructured events | Clean, organized lifecycle events |
Why we picked hooks for Aiceberg
We need to actually stop bad content from reaching users, not just log it after the fact. Hooks let us check content at key moments, like right before the agent sends something to the LLM or right after the LLM responds, and we can say "nope, stop right there" if something's unsafe. Also, hook events map nicely to Aiceberg's event model: user→agent, agent→LLM, agent→tool. It's a natural fit.
How it works in practice
When your agent is running, hooks fire at important moments. We catch those moments, send the content to Aiceberg for a safety check, and if Aiceberg says "blocked", we throw a SafetyException and the agent stops immediately. The user never sees the unsafe content; they just get a safe fallback message instead.
Callback handlers are great for watching. Hooks are great for controlling. We need control, so we use hooks.
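As an illustration, here is a minimal sketch of how an application might surface that fallback. The import path for the handler and exception is an assumption; adapt it to wherever the integration module lives in your project.

```python
from strands import Agent

# Hypothetical import path for this integration's handler and exception.
from strands_aiceberg_handler import StrandsAicebergHandler, SafetyException

agent = Agent(hooks=[StrandsAicebergHandler()])

try:
    result = agent("Summarize this customer email for me.")
    print(result)
except SafetyException:
    # Aiceberg blocked content somewhere in the flow; show a safe fallback instead.
    print("Sorry, I can't help with that request.")
```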
What we built
A simple hook provider that forwards Strands events to Aiceberg for safety monitoring. Aiceberg can block unsafe events mid-flow, which stops subsequent steps such as LLM or tool calls from happening.
How it works
StrandsAicebergHandler implements Strands' HookProvider interface and listens for six specific events.
We monitor these events:
MessageAddedEvent: When a message is added to the conversation
AfterInvocationEvent: After the agent completes its invocation
BeforeModelCallEvent: Before calling the LLM
AfterModelCallEvent: After the LLM responds
BeforeToolCallEvent: Before executing a tool
AfterToolCallEvent: After a tool completes
When you register the handler with your agent, Strands automatically calls our callbacks at each critical moment.
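In outline, registration looks roughly like the sketch below. HookProvider, HookRegistry, and the event classes come from the Strands hooks module (the exact import path may differ across SDK versions), and the callback method names other than on_tool_input are illustrative.

```python
from strands.hooks import (
    HookProvider,
    HookRegistry,
    MessageAddedEvent,
    AfterInvocationEvent,
    BeforeModelCallEvent,
    AfterModelCallEvent,
    BeforeToolCallEvent,
    AfterToolCallEvent,
)


class StrandsAicebergHandler(HookProvider):
    """Forwards the six monitored Strands hook events to Aiceberg."""

    def register_hooks(self, registry: HookRegistry, **kwargs) -> None:
        # Strands calls this once when the handler is passed via Agent(hooks=[...]).
        registry.add_callback(MessageAddedEvent, self.on_user_message)     # Gate 1
        registry.add_callback(BeforeModelCallEvent, self.on_llm_input)     # Gate 2
        registry.add_callback(AfterModelCallEvent, self.on_llm_output)     # Gate 3
        registry.add_callback(BeforeToolCallEvent, self.on_tool_input)     # audit only
        registry.add_callback(AfterToolCallEvent, self.on_tool_output)     # audit only
        registry.add_callback(AfterInvocationEvent, self.on_final_answer)  # Gate 4
```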
Strands provides eight hook events in total (listed as available events in the Strands docs). Out of these, we currently use six for safety monitoring. The remaining two are not in use:
AgentInitializedEvent: Triggered when the agent is first constructed (not useful for content safety)
BeforeInvocationEvent: Triggered at the start of a request (we use MessageAddedEvent instead)
User-to-Agent (user_agt)
This is where user inputs enter the system and final responses leave. We monitor two events:
MessageAddedEvent (Gate 1): When a message is added to the conversation, we send the raw user utterance to Aiceberg, capture the returned event_id, and block immediately if moderation fails.
AfterInvocationEvent (Gate 4): After the agent completes its invocation, we send the final assistant reply to Aiceberg, tied back to the Gate 1 event_id. If rejected, we swap the user-facing answer for your fallback.
These two gates bookend the entire conversation turn: what comes in and what goes out.
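Continuing the sketch above, the two user_agt callbacks look roughly like this. The event attribute access and the monitor's send method are assumptions made for illustration; only _check_safety and the event_id linking come from this page.

```python
    def on_user_message(self, event) -> None:
        # Gate 1: raw user utterance, checked before anything else runs.
        text = self._extract_text_from_content(event.message)  # attribute name assumed
        result = self.monitor.send(event_type="user_agt", direction="input", content=text)
        self._turn_event_id = result.get("event_id")            # remembered for Gate 4
        self._check_safety(result, "USER_INPUT")                # raises SafetyException if blocked

    def on_final_answer(self, event) -> None:
        # Gate 4: final assistant reply, tied back to the Gate 1 event_id.
        text = self._extract_text_from_content(event.agent.messages[-1])  # attribute path assumed
        result = self.monitor.send(
            event_type="user_agt",
            direction="output",
            content=text,
            event_id=self._turn_event_id,
        )
        self._check_safety(result, "FINAL_OUTPUT")
```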
Agent-to-LLM (agt_llm)
This is where the agent communicates with the language model. We monitor two events:
BeforeModelCallEvent (Gate 2): Before calling the LLM, we send the exact messages array Strands will send to the model, including system prompts, conversation history, and tool definitions. We capture the event_id and can block if needed.
AfterModelCallEvent (Gate 3): After the LLM responds, we send the raw response to Aiceberg, including text content and any tool call directives. Both halves are linked via the stored event_id, and either can be blocked.
These gates control what the LLM sees and what it produces.
Agent-to-Tool (agt_tool, agt_mem, A2A)
This is where the agent executes tools. The LLM decides when to call tools based on the user's question. We monitor two events:
BeforeToolCallEvent: Before executing a tool, we send the tool name and input parameters to Aiceberg and capture the event_id.
AfterToolCallEvent: After the tool completes, we send the result back to Aiceberg, linked to the same event_id.
We don't block tool execution by default: if a tool returns unsafe content, it gets caught at Gate 3 (when the LLM processes the result) or Gate 4 (before showing to the user). The tool hooks provide audit visibility.
What counts as a tool?
Regular tools: Calculator, web search, database queries, API calls; anything the LLM can invoke as a function.
Memory operations (Agent-to-memory): Memory uses the same tool hooks. The LLM calls mem0_memory(action="store") or mem0_memory(action="retrieve") just like any other tool. If you set AB_monitoring_profile_A2MEM, they show as agt_mem events instead of agt_tool for a dedicated dashboard view.
Agent-to-agent communication (A2A): When one agent calls another, it happens through tool calls. According to the Strands documentation, this triggers the same BeforeToolCallEvent and AfterToolCallEvent hooks we already monitor. This needs more extensive testing on our side, but the monitoring pattern is the same. Configure AB_monitoring_profile_A2A if you want A2A calls to show as dedicated events on the dashboard.
Can we block tools?
Yes, technically. The hook system allows raising SafetyException at BeforeToolCallEvent. If you add self._check_safety(result, "TOOL_INPUT") after sending to Aiceberg, the tool won't execute.
Why we don't block by default: Blocking tools mid-flight breaks the agent's flow. The LLM expects tool results. If you block a tool, you have three bad options:
Send an error message as the tool result: confuses the LLM
Send empty/fake data: breaks logic
Stop the entire agent: the user gets an incomplete response
Better approach: Log tools for audit visibility but don't block. If a tool returns unsafe content, it gets caught at Gate 3 (when LLM processes the result) or Gate 4 (before showing to user).
When to block tools: If your use case requires it (e.g., preventing database writes), add the safety check. Just handle what happens next; typically, show an error message to the user.
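A sketch of what that looks like in the tool-input callback, continuing the handler sketch from earlier. The tool_use attribute access is an assumption; the commented line is the documented one-liner.

```python
    def on_tool_input(self, event) -> None:
        # BeforeToolCallEvent: forward the tool name and input parameters to Aiceberg.
        tool_use = event.tool_use  # attribute name assumed for illustration
        result = self.monitor.send(
            event_type="agt_tool",
            direction="input",
            content={"name": tool_use["name"], "input": tool_use["input"]},
        )
        # Default behaviour: audit only, never block here.
        # To enforce blocking (e.g. preventing database writes), uncomment:
        # self._check_safety(result, "TOOL_INPUT")  # raises SafetyException, tool never runs
```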
Walking through a query
Example: what happens when a user asks "What's 10 + 5?"
Step-by-step breakdown
Tool execution (monitoring with full context)
The agent executes the calculator tool. Strands fires BeforeToolCallEvent and AfterToolCallEvent. We log both but don't block.
Tool events are sent to Aiceberg under agt_tool type. We skip the safety check here to keep the agent flow smooth.
Can we block tool calls?
Yes. The hook system allows raising exceptions at BeforeToolCallEvent. The current implementation does not do this by design because blocking tools mid-flight breaks the agentic flow. If your use case requires blocking tools based on Aiceberg moderation, add one line after sending to Aiceberg:
self._check_safety(result, "TOOL_INPUT")
This will raise SafetyException if Aiceberg blocks the tool, preventing execution. You need to handle what happens next; typically, show an error to the user.
The dashboard view
After this flow completes, you'll see three event types in Aiceberg (all under the same profile if configured that way):
User to Agent (type: user_agt)
Input: "What's 10 + 5?"
Output: "10 + 5 equals 15."
Agent to LLM (type: agt_llm, two pairs in this case)
Pair 1: Input: [messages array] → Output: [tool use request]
Pair 2: Input: [messages with tool result] → Output: "10 + 5 equals 15."
Agent to Tool (type: agt_tool)
Input: {"name": "calculator", "toolUseId": "call_123", "input": {"operation": "add", "a": 10, "b": 5}}
Output: {"status": "success", "content": [{"text": "15.0"}]}
Each input/output pair is linked by the event_id, so you can trace a single user question through the entire agent pipeline.
Key design choices
Why raw content with no formatting?
We send exactly what Strands sends: no prefixes, no labels, no wrapper strings. This keeps Aiceberg's signal clean and makes debugging easier. What you see in the dashboard is exactly what the agent processed.
Why three event types?
Each event type (user→agent, agent→LLM, agent→tool) has different moderation needs. You might allow certain language from users but block it in LLM prompts, or apply stricter policies to final responses. Separate event types give you that flexibility without complicated conditional logic.
Why don't we block tool execution?
Blocking a tool mid-flight can break Strands' event state. The agent expects tools to complete, and interrupting that can leave things in a weird state. Instead, we log tool activity but don't enforce safety there. If a tool returns something unsafe, we'll catch it at Safety Gate 3 (when the LLM processes the result) or Safety Gate 4 (before the final response goes to the user).
This is a design choice, not a technical limitation. The hook system allows raising exceptions at BeforeToolCallEvent, and if you raised SafetyException there, the tool would not execute. The current implementation does not block tools because the LLM expects to receive tool results; blocking a tool mid-execution leaves you with the same three bad options described above (a confused LLM, broken logic, or an incomplete response). Instead, the design logs tools for observability but does not block them. If a tool returns unsafe content, it gets caught at Safety Gate 3 when the LLM processes the tool result, or at Safety Gate 4 before the final response goes to the user. If you need to block tools, add this one line in on_tool_input after sending to Aiceberg: self._check_safety(result, "TOOL_INPUT").
Why forward_to_llm: false?
We're observing the data flow, not proxying it. Strands already handles LLM calls; Aiceberg just gets a copy for monitoring and policy enforcement. If Aiceberg blocks something, we raise an error, and the application decides what to do next.
How memory tools work
Memory tools are just like any other tool: the LLM decides when to use them. What makes them different is what they do.
Storing and retrieving memories: the LLM invokes the same mem0_memory tool with different action values (store, retrieve), as in the sketch below.
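A minimal sketch of wiring memory into an agent so these calls show up in monitoring. It assumes the mem0_memory tool from the strands_tools package is installed; the handler's import path is hypothetical.

```python
from strands import Agent
from strands_tools import mem0_memory  # assumed to be available from strands-agents-tools

from strands_aiceberg_handler import StrandsAicebergHandler  # hypothetical module name

agent = Agent(
    tools=[mem0_memory],
    hooks=[StrandsAicebergHandler()],  # memory calls surface as agt_tool (or agt_mem) events
)

# Storing: the LLM chooses to call mem0_memory(action="store", ...)
agent("Remember that my favourite database is PostgreSQL.")

# Retrieving: the LLM chooses to call mem0_memory(action="retrieve", ...) via semantic search
agent("Which database do I prefer?")
```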
What's special about memory:
Same tool, different actions: mem0_memory can store or retrieve, controlled by the action parameter
Persistence: unlike calculator or other stateless tools, memory persists across agent instances
Semantic search: retrieval uses RAG (embeddings + vector search), not exact matching
The LLM orchestrates everything: when to store, when to retrieve, what search query to use. Just like with calculator, you give it the tool and let it decide.
Where memory shows up in monitoring
Memory operations are tool calls, so they appear in your agt_tool events (or agt_mem if you use the dedicated profile).
Same tool, different actions:
mem0_memory(action="store"): stores a fact
mem0_memory(action="retrieve"): searches with semantic similarity (RAG)
mem0_memory(action="list"): shows all stored memories
Each one triggers BeforeToolCallEvent → AfterToolCallEvent, just like calculator or any other tool.
What the code does
AicebergMonitor is the simple HTTP client. It loads profile IDs from environment variables, builds the payload, calls base_url/eap/v0/event with your API key, and returns the response. If there's no API key, it returns {"event_result": "passed"} so local dev works without credentials. Network errors also return "passed" so monitoring failures don't break your agent.
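A rough sketch of that client, under the assumptions that the response is JSON and that the payload fields beyond forward_to_llm and event_result look something like the following (the real field names may differ):

```python
import os
from typing import Optional

import requests


class AicebergMonitor:
    """Thin HTTP client: builds the event payload, posts it to Aiceberg, returns the verdict."""

    def __init__(self) -> None:
        self.api_key = os.getenv("AICEBERG_API_KEY", "")
        self.base_url = os.getenv("AICEBERG_BASE_URL", "")      # variable name assumed
        self.profile_id = os.getenv("AICEBERG_PROFILE_ID", "")

    def send(self, event_type: str, direction: str, content, event_id: Optional[str] = None) -> dict:
        if not self.api_key:
            # No credentials: keep local development working, treat everything as passed.
            return {"event_result": "passed"}

        payload = {                     # field names are illustrative assumptions
            "type": event_type,
            "direction": direction,
            "content": content,
            "profile_id": self.profile_id,
            "forward_to_llm": False,    # we observe the flow, we don't proxy it
        }
        if event_id:
            payload["event_id"] = event_id

        try:
            resp = requests.post(
                f"{self.base_url}/eap/v0/event",
                json=payload,
                headers={"Authorization": self.api_key},  # key already includes the "Bearer " prefix
                timeout=10,
            )
            return resp.json()
        except requests.RequestException:
            # Monitoring failures must never take the agent down.
            return {"event_result": "passed"}
```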
StrandsAicebergHandler implements the hook lifecycle. It registers callbacks for six Strands events, extracts content from each one, sends it to AicebergMonitor, checks the result, and maintains event ID mappings for linking inputs to outputs.
_check_safety checks the response from Aiceberg. If event_result is "blocked" or "rejected", it raises SafetyException with a descriptive message. This stops the agent immediately so you can catch the exception and show a safe fallback.
_extract_text_from_content handles Strands' message formats. Messages can be lists of content blocks or simple strings. This helper normalizes everything to plain text for user-facing events.
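Roughly, and shown as standalone functions for brevity, the two helpers amount to this (a sketch; the content-block handling follows the Strands message format as described above):

```python
class SafetyException(Exception):
    """Raised when Aiceberg blocks or rejects an event."""


def _check_safety(result: dict, stage: str) -> None:
    # Stop the agent immediately if Aiceberg vetoed the content at this stage.
    if result.get("event_result") in ("blocked", "rejected"):
        raise SafetyException(f"Content blocked by Aiceberg safety filter at {stage}")


def _extract_text_from_content(message) -> str:
    # Strands messages are either plain strings or dicts holding a list of content blocks.
    if isinstance(message, str):
        return message
    blocks = message.get("content", []) if isinstance(message, dict) else message
    parts = []
    for block in blocks:
        if isinstance(block, dict) and "text" in block:
            parts.append(block["text"])
        elif isinstance(block, str):
            parts.append(block)
    return " ".join(parts)
```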
Quick setup (5 minutes)
Install deps (once):
Add environment variables (.env):
AICEBERG_API_KEY=Bearer ...
AICEBERG_PROFILE_ID=... (use this for all events)
Or use specific profiles for each event type:
AB_monitoring_profile_U2A=... (user→agent events)
AB_monitoring_profile_A2M=... (agent→LLM events)
AB_monitoring_profile_A2T=... (agent→tool events)
AB_monitoring_profile_A2MEM=... (agent→memory events; optional, defaults to A2T)
Register the handler:
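A minimal registration, assuming python-dotenv and the calculator tool from strands_tools are installed; the handler's import path is hypothetical, so adjust it to where the module lives in your project.

```python
from dotenv import load_dotenv
from strands import Agent
from strands_tools import calculator

from strands_aiceberg_handler import StrandsAicebergHandler  # hypothetical module name

load_dotenv()  # picks up AICEBERG_API_KEY and the profile IDs from .env

agent = Agent(
    tools=[calculator],
    hooks=[StrandsAicebergHandler()],  # one line: real-time safety monitoring
)

print(agent("What's 10 + 5?"))
```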
Run your agent and check the Aiceberg dashboard for events.
Event flow at a glance
| Hook event | Content sent to Aiceberg | Direction | Event type | Gate |
|---|---|---|---|---|
| MessageAddedEvent | User question | Input | user_agt | Gate 1 |
| BeforeModelCallEvent | Messages array | Input | agt_llm | Gate 2 |
| AfterModelCallEvent | LLM response | Output | agt_llm | Gate 3 |
| BeforeToolCallEvent | Tool use object | Input | agt_tool | Log only |
| BeforeToolCallEvent (memory) | Memory tool call | Input | agt_mem** | Optional* |
| AfterToolCallEvent (memory) | Memory result | Output | agt_mem** | Optional* |
| AfterToolCallEvent | Tool result | Output | agt_tool | Log only |
| AfterInvocationEvent | Final answer | Output | user_agt | Gate 4 |
Optional tool blocking: Tool safety checks are technically possible but not enabled by default. Add self._check_safety(result, "TOOL_INPUT") to block tools. By default, we log for audit but don't block; unsafe tool outputs get caught at Gate 3 or Gate 4.
Memory event type: Memory operations use agt_mem if you set AB_monitoring_profile_A2MEM in your environment. Otherwise, they appear as agt_tool alongside regular tools.
Logging & observability
Startup banner shows whether credentials were found and which profiles are loaded.
Each event prints a short preview: event type, profile ID (truncated), and whether it's input or output.
Successful sends show Aiceberg's response: "passed", "blocked", or "rejected".
Safety violations raise SafetyException with clear messages like "Content blocked by Aiceberg safety filter at LLM_OUTPUT".
Tool execution is logged but not blocked to avoid breaking agent state.
Safety exceptions surface as SafetyException; wrap your agent calls to show user-friendly messages.
Missing environment variables cause events to skip silently (they return "passed" so your agent keeps working).
Additional Info
The tool events include a feature to capture and monitor the invocation state, which holds metadata about tool calls. This can be used to enhance observability across multi-agent scenarios, enabling more robust coordination patterns.
This was discussed in a Strands feature-request issue to make the related events available.
Memory in Strands
Your agent can remember things across conversations. Strands gives you two ways to do this:
FileSessionManager โ Simple session persistence. Saves the entire conversation to a file. When you create a new agent with the same session ID, it loads the history. Good for basic chatbots where you just need conversation context.
mem0 with vector storage โ Smart memory using embeddings. Stores facts in a vector database and retrieves them with semantic search (RAG). The LLM decides what to remember and when to recall it, which is better for complex apps where you need long-term memory.
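For the first option, a minimal sketch; the FileSessionManager import path and parameters follow the Strands session docs as we understand them, so treat them as assumptions.

```python
from strands import Agent
from strands.session.file_session_manager import FileSessionManager

# Conversation history is persisted on disk under this session ID.
agent = Agent(session_manager=FileSessionManager(session_id="user-42"))
agent("My name is Priya.")

# A new agent created later with the same session ID loads the saved history.
agent_later = Agent(session_manager=FileSessionManager(session_id="user-42"))
agent_later("What's my name?")
```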