Amazon Strands Agents

Overview

StrandsAicebergHandler monitors your Strands agent conversations for safety by listening to hook events and sending them to Aiceberg.

Drop it into your agent with one line, hooks=[StrandsAicebergHandler()], and get real-time safety monitoring for user queries, LLM calls, and tool execution.

Why hooks and not callbacks?

Strands gives you two options for watching what your agent is doing:

Callback Handlers are like listeners that respond immediately to everything happening during your agent's execution. They fire constantly as things happen - when the model is thinking, when a tool runs, when output streams to the user. They're lightweight and let you see partial results in real time.

Hooks are more structured. Instead of listening to everything, they fire at specific lifecycle moments that revolve around agent interactions - like right before calling the LLM, or right after a tool finishes. They give you organized events at key checkpoints, and more importantly, they can interrupt the agent if something's wrong.

Quick comparison

| | Callback Handlers | Hooks |
| --- | --- | --- |
| What they do | Listen to everything happening in real time, as it happens | Listen to major events before and after they happen |
| Best for | Streaming output to your UI; logging and debugging; watching things as they happen | Safety checks and guardrails; blocking bad content; enforcing rules |
| Can they stop the agent? | No - just watch and log | Yes - can stop execution immediately |
| Structure | Lots of small, unstructured events | Clean, organized lifecycle events |

Why we picked hooks for Aiceberg

We need to actually stop bad content from reaching users, not just log it after the fact. Hooks let us check content at key moments - like right before the agent sends something to the LLM, or right after the LLM responds - and we can say "nope, stop right there" if something's unsafe. Also, hook events map nicely to Aiceberg's event model: user↔agent, agent↔LLM, agent↔tool. It's a natural fit.

How it works in practice

When your agent is running, hooks fire at important moments. We catch those moments, send the content to Aiceberg for a safety check, and if Aiceberg says "blocked", we throw a SafetyException and the agent stops immediately. The user never sees the unsafe content - they just get a safe fallback message instead.
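For example, a minimal calling pattern looks like this. The Agent(hooks=[...]) pattern comes from this page; the aiceberg_strands module name used for the handler and SafetyException imports is illustrative, so adjust it to wherever the handler lives in your project:

```python
from strands import Agent

# Illustrative import path for this integration's handler and exception.
from aiceberg_strands import StrandsAicebergHandler, SafetyException

agent = Agent(hooks=[StrandsAicebergHandler()])

try:
    result = agent("What's 10 + 5?")
    print(result)
except SafetyException:
    # Aiceberg blocked something mid-flow; show a safe fallback instead.
    print("Sorry, I can't help with that request.")
```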

Callback handlers are great for watching. Hooks are great for controlling. We need control, so we use hooks.

What we built

A simple hook provider that forwards Strands events to Aiceberg for safety monitoring. Aiceberg can block unsafe events mid-flow, which prevents subsequent events such as LLM or tool calls from happening.

How it works

StrandsAicebergHandler implements Strands' HookProvider interface and listens for six specific events (source: Strands docs).

We monitor these events:

  • MessageAddedEvent - When a message is added to the conversation

  • AfterInvocationEvent - After the agent completes its invocation

  • BeforeModelCallEvent - Before calling the LLM

  • AfterModelCallEvent - After the LLM responds

  • BeforeToolCallEvent - Before executing a tool

  • AfterToolCallEvent - After a tool completes

When you register the handler with your agent, Strands automatically calls our callbacks at each critical moment.
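As a rough sketch, the registration looks like this. The HookProvider/HookRegistry pattern and the event names come from the Strands hooks API, but import paths can vary between Strands versions, and apart from on_user_query and on_tool_input (mentioned later on this page) the callback method names below are illustrative:

```python
# Import paths are an assumption; some events live under strands.experimental.hooks in older versions.
from strands.hooks import (
    HookProvider,
    HookRegistry,
    MessageAddedEvent,
    BeforeModelCallEvent,
    AfterModelCallEvent,
    BeforeToolCallEvent,
    AfterToolCallEvent,
    AfterInvocationEvent,
)

class StrandsAicebergHandler(HookProvider):
    """Skeleton only; the real handler forwards each event to Aiceberg."""

    def register_hooks(self, registry: HookRegistry) -> None:
        registry.add_callback(MessageAddedEvent, self.on_user_query)         # Gate 1
        registry.add_callback(BeforeModelCallEvent, self.on_llm_input)       # Gate 2
        registry.add_callback(AfterModelCallEvent, self.on_llm_output)       # Gate 3
        registry.add_callback(BeforeToolCallEvent, self.on_tool_input)       # audit only
        registry.add_callback(AfterToolCallEvent, self.on_tool_output)       # audit only
        registry.add_callback(AfterInvocationEvent, self.on_final_response)  # Gate 4

    # Bodies omitted here; each callback sends content to Aiceberg and checks the result.
    def on_user_query(self, event): ...
    def on_llm_input(self, event): ...
    def on_llm_output(self, event): ...
    def on_tool_input(self, event): ...
    def on_tool_output(self, event): ...
    def on_final_response(self, event): ...
```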

Strands provides eight hook events in total; the full list of available events is in the Strands docs. We currently use six of them for safety monitoring. The remaining two are not in use:

  • AgentInitializedEvent - Triggered when the agent is first constructed (not useful for content safety)

  • BeforeInvocationEvent - Triggered at the start of a request (we use MessageAddedEvent instead)

User-to-Agent (user_agt)

This is where user inputs enter the system and final responses leave. We monitor two events:

  • MessageAddedEvent (Gate 1): When a message is added to the conversation, we send the raw user utterance to Aiceberg, capture the returned event_id, and block immediately if moderation fails.

  • AfterInvocationEvent (Gate 4): After the agent completes its invocation, we send the final assistant reply to Aiceberg. This is tied back to the Gate 1 event_id. If rejected, we swap the user-facing answer for your fallback.

These two gates bookend the entire conversation turn - what comes in and what goes out.
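A sketch of the Gate 1 callback under these assumptions (the monitor helper, its send_event signature, and the "USER_INPUT" label are illustrative, and the exact attributes on MessageAddedEvent can differ by Strands version):

```python
# Inside StrandsAicebergHandler
def on_user_query(self, event):
    message = getattr(event, "message", {})              # the message just added to the conversation
    if message.get("role") != "user":
        return                                            # only moderate user turns at this gate
    text = self._extract_text_from_content(message.get("content"))
    result = self.monitor.send_event(                     # illustrative client call
        event_type="user_agt", direction="input", content=text
    )
    self.user_event_id = result.get("event_id")           # reused at Gate 4 to link input and output
    self._check_safety(result, "USER_INPUT")              # raises SafetyException if blocked/rejected
```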

Agent-to-LLM (agt_llm)

This is where the agent communicates with the language model. We monitor two events:

  • BeforeModelCallEvent (Gate 2): Before calling the LLM, we send the exact messages array Strands will send to the model. This includes system prompts, conversation history, and tool definitions. We capture the event_id and can block if needed.

  • AfterModelCallEvent (Gate 3): After the LLM responds, we send the raw response to Aiceberg. This includes text content and any tool call directives. Both halves are linked via the stored event_id, and either can be blocked.

These gates control what the LLM sees and what it produces.

Agent-to-Tool (agt_tool, agt_mem, A2A)

This is where the agent executes tools. The LLM decides when to call tools based on the user's question. We monitor two events:

  • BeforeToolCallEvent: Before executing a tool, we send the tool name and input parameters to Aiceberg and capture the event_id.

  • AfterToolCallEvent: After the tool completes, we send the result back to Aiceberg, linked to the same event_id.

We don't block tool execution by default - if a tool returns unsafe content, it gets caught at Gate 3 (when LLM processes the result) or Gate 4 (before showing to user). The tool hooks provide audit visibility.

What counts as a tool?

  • Regular tools: Calculator, web search, database queries, API calls - anything the LLM can invoke as a function.

  • Memory operations (Agent-to-memory): Memory uses the same tool hooks. The LLM calls mem0_memory(action="store") or mem0_memory(action="retrieve") just like any other tool. If you set AB_monitoring_profile_A2MEM, they show as agt_mem events instead of agt_tool for a dedicated dashboard view.

  • Agent-to-agent communication (Agent-to-agent): When one agent calls another, it happens through tool calls. According to the Strands documentation, this triggers the same BeforeToolCallEvent and AfterToolCallEvent hooks we already monitor. This needs more extensive testing on our side, but the monitoring pattern is the same. Configure AB_monitoring_profile_A2A if you want A2A calls to show as dedicated events on the dashboard.

Can we block tools?

Yes, technically. The hook system allows raising SafetyException at BeforeToolCallEvent. If you add self._check_safety(result, "TOOL_INPUT") after sending to Aiceberg, the tool won't execute.

Why we don't block by default: Blocking tools mid-flight breaks the agent's flow. The LLM expects tool results. If you block a tool, you have three bad options:

  • Send error message as tool result → Confuses the LLM

  • Send empty/fake data → Breaks logic

  • Stop entire agent → User gets incomplete response

Better approach: Log tools for audit visibility but don't block. If a tool returns unsafe content, it gets caught at Gate 3 (when LLM processes the result) or Gate 4 (before showing to user).

When to block tools: If your use case requires it (e.g., preventing database writes), add the safety check. Just handle what happens next - typically show an error message to the user.
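If you do opt in, the change is just where you place that check inside the tool-input callback. A sketch, with the monitor call shape and event attribute access as assumptions:

```python
# Inside StrandsAicebergHandler
def on_tool_input(self, event):
    tool_use = getattr(event, "tool_use", {})             # tool name, toolUseId, input parameters
    result = self.monitor.send_event(
        event_type="agt_tool", direction="input", content=str(tool_use)
    )
    self._check_safety(result, "TOOL_INPUT")              # opt-in: raises SafetyException, tool never runs
```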

Walking through a query

Example: what happens when a user asks "What's 10 + 5?"

Step-by-step breakdown

Step 1: User query arrives (Safety Gate 1)

Strands calls on_user_query with a MessageAddedEvent. We grab the user's question and send it to Aiceberg.

Aiceberg receives the raw user utterance as a user_agt input event.
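A rough sketch of that payload as a Python dict; only event_type, the profile ID, and forward_to_llm are fields named elsewhere on this page, the rest is illustrative:

```python
payload = {
    "event_type": "user_agt",                      # user-to-agent gate
    "profile_id": "<AB_monitoring_profile_U2A>",   # or AICEBERG_PROFILE_ID if you use one profile
    "forward_to_llm": False,                       # Aiceberg observes; Strands still makes the LLM call
    "content": "What's 10 + 5?",                   # the raw user utterance, no wrapper text
}
```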

Response includes event_id: "evt_user_123", which we store for later.

Step 2: LLM input prepared (Safety Gate 2)

Your agent builds a prompt with system instructions and the user question. Strands fires BeforeModelCallEvent.
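The exact shape depends on your model provider; a sketch using Bedrock-style content blocks with illustrative values:

```python
messages = [
    {"role": "user", "content": [{"text": "What's 10 + 5?"}]},
]
# Strands also attaches the system prompt and tool definitions (e.g. the calculator tool)
# to this model call; all of it goes to Aiceberg as the agt_llm input.
```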

We send the entire messages array to Aiceberg under the agt_llm event type. No extra formatting - just the raw data Strands is sending to the model.

Step 3: LLM responds with tool call (Safety Gate 3)

The model decides it needs to use a calculator tool. Strands fires AfterModelCallEvent.
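Roughly what that response looks like (Bedrock-style content blocks; values are illustrative and match the dashboard example below):

```python
llm_response = {
    "role": "assistant",
    "content": [
        {"toolUse": {
            "toolUseId": "call_123",
            "name": "calculator",
            "input": {"operation": "add", "a": 10, "b": 5},
        }}
    ],
}
# Sent to Aiceberg as the agt_llm output for Gate 3.
```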

The LLM's response includes a tool use request. Aiceberg checks it for safety before we proceed.

Step 4: Tool execution (monitoring with full context)

The agent executes the calculator tool. Strands fires BeforeToolCallEvent and AfterToolCallEvent. We log both but don't block.
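A sketch of the pair of tool events we forward; the same values show up in the dashboard example below:

```python
tool_input = {
    "name": "calculator",
    "toolUseId": "call_123",
    "input": {"operation": "add", "a": 10, "b": 5},
}
tool_output = {"status": "success", "content": [{"text": "15.0"}]}
# Both are sent under agt_tool and linked by the same event_id; no safety check is raised here.
```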

Tool events are sent to Aiceberg under agt_tool type. We skip the safety check here to keep the agent flow smooth.

Can we block tool calls?

Yes. The hook system allows raising exceptions at BeforeToolCallEvent. The current implementation does not do this by design because blocking tools mid-flight breaks the agentic flow. If your use case requires blocking tools based on Aiceberg moderation, add one line after sending to Aiceberg:

self._check_safety(result, "TOOL_INPUT")

This will raise SafetyException if Aiceberg blocks the tool, preventing execution. You need to handle what happens next - typically show an error to the user.

Step 5: LLM generates final answer (Safety Gate 3, round 2)

The agent sends the tool result back to the LLM for a final answer. This triggers another BeforeModelCallEvent → AfterModelCallEvent cycle, with full safety checks both times. The LLM counter increments, so this is tracked as a separate LLM call.

Step 6: Final response to user (Safety Gate 4)

The agent wraps up and returns the answer. Strands fires AfterInvocationEvent.
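A sketch of the Gate 4 send; the helper names follow the descriptions on this page, and the "AGENT_OUTPUT" label is illustrative:

```python
# Inside the final-response callback on StrandsAicebergHandler
result = self.monitor.send_event(
    event_type="user_agt",
    direction="output",
    content="10 + 5 equals 15.",
    event_id=self.user_event_id,            # links the reply back to the Gate 1 event
)
self._check_safety(result, "AGENT_OUTPUT")  # raises SafetyException if blocked/rejected
```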

This is the last safety gate. If everything passes, the answer goes to the user. If blocked, a SafetyException is raised and your app shows a fallback message.

The dashboard view

After this flow completes, you'll see three event types in Aiceberg (all under the same profile if configured that way):

  1. User to Agent (type: user_agt)

    • Input: "What's 10 + 5?"

    • Output: "10 + 5 equals 15."

  2. Agent to LLM (type: agt_llm, two pairs in this case)

    • Pair 1: Input: [messages array] → Output: [tool use request]

    • Pair 2: Input: [messages with tool result] → Output: "10 + 5 equals 15."

  3. Agent to Tool (type: agt_tool)

    • Input: {"name": "calculator", "toolUseId": "call_123", "input": {"operation": "add", "a": 10, "b": 5}}

    • Output: {"status": "success", "content": [{"text": "15.0"}]}

Each input/output pair is linked by the event_id, so you can trace a single user question through the entire agent pipeline.

Key design choices

Why raw content with no formatting?

We send exactly what Strands sends - no prefixes, no labels, no wrapper strings. This keeps Aiceberg's signal clean and makes debugging easier. What you see in the dashboard is exactly what the agent processed.

Why three event types?

Each event type (user↔agent, agent↔LLM, agent↔tool) has different moderation needs. You might allow certain language from users but block it in LLM prompts, or apply stricter policies to final responses. Separate event types give you that flexibility without complicated conditional logic.

Why don't we block tool execution?

Blocking a tool mid-flight can break Strands' event state. The agent expects tools to complete, and interrupting that can leave things in a weird state. Instead, we log tool activity but don't enforce safety there. If a tool returns something unsafe, we'll catch it at Safety Gate 3 (when the LLM processes the result) or Safety Gate 4 (before the final response goes to the user).

This is a design choice, not a technical limitation. The hook system allows raising SafetyException at BeforeToolCallEvent, and if you did, the tool would not execute. We don't, because the LLM expects to receive tool results, and blocking a tool mid-execution leaves you with the same three bad options listed earlier: an error message as the tool result confuses the LLM, empty or fake data breaks the logic, and stopping the entire agent leaves the user with an incomplete response. Instead, the design logs tools for observability and relies on Safety Gate 3 (when the LLM processes the tool result) and Safety Gate 4 (before the final response goes to the user) to catch unsafe tool content. If you need to block tools, add one line in on_tool_input after sending to Aiceberg: self._check_safety(result, "TOOL_INPUT").

Why forward_to_llm: false?

We're observing the data flow, not proxying it. Strands already handles LLM calls; Aiceberg just gets a copy for monitoring and policy enforcement. If Aiceberg blocks something, we raise an error - the application decides what to do next.

How memory tools work

Memory tools are just like any other tool - the LLM decides when to use them. What makes them different is what they do:

Storing memories:
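The store call the LLM emits looks like an ordinary tool use. Only the action parameter is documented on this page; the other field names and values are illustrative:

```python
store_call = {
    "name": "mem0_memory",
    "input": {"action": "store", "content": "The user prefers metric units."},
}
```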

Retrieving memories:
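Retrieval is the same tool with a different action; the query drives a semantic (RAG) search rather than exact matching. Again, field names other than action are illustrative:

```python
retrieve_call = {
    "name": "mem0_memory",
    "input": {"action": "retrieve", "query": "Which units does the user prefer?"},
}
```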

What's special about memory:

  • Same tool, different actions - mem0_memory can store OR retrieve, controlled by the action parameter

  • Persistence - Unlike calculator or other stateless tools, memory persists across agent instances

  • Semantic search - Retrieval uses RAG (embeddings + vector search), not exact matching

The LLM orchestrates everything - when to store, when to retrieve, what search query to use. Just like with calculator, you give it the tool and let it decide.

Where memory shows up in monitoring

Memory operations are tool calls, so they appear in your agt_tool events (or agt_mem if you use the dedicated profile).

Same tool, different actions:

  • mem0_memory(action="store") - Stores a fact

  • mem0_memory(action="retrieve") - Searches with semantic similarity (RAG)

  • mem0_memory(action="list") - Shows all stored memories

Each one triggers BeforeToolCallEvent → AfterToolCallEvent, just like calculator or any other tool.

What the code does

AicebergMonitor is the simple HTTP client. It loads profile IDs from environment variables, builds the payload, calls base_url/eap/v0/event with your API key, and returns the response. If there's no API key, it returns {"event_result": "passed"} so local dev works without credentials. Network errors also return "passed" so monitoring failures don't break your agent.
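A minimal sketch of that client, assuming the base URL is supplied via an AICEBERG_BASE_URL variable. That variable name, the send_event method name, and the payload field names are assumptions; only the /eap/v0/event path and the fallback behaviour come from this page:

```python
import os
import requests

class AicebergMonitor:
    """Minimal sketch of the HTTP client described above."""

    def __init__(self) -> None:
        self.base_url = os.getenv("AICEBERG_BASE_URL", "")   # assumed variable name
        self.api_key = os.getenv("AICEBERG_API_KEY")
        self.profile_id = os.getenv("AICEBERG_PROFILE_ID")

    def send_event(self, event_type: str, content: str, **extra) -> dict:
        if not self.api_key:
            return {"event_result": "passed"}                # local dev without credentials
        payload = {
            "event_type": event_type,
            "profile_id": self.profile_id,
            "forward_to_llm": False,
            "content": content,
            **extra,
        }
        try:
            resp = requests.post(
                f"{self.base_url}/eap/v0/event",
                headers={"Authorization": self.api_key},      # AICEBERG_API_KEY already includes "Bearer ..."
                json=payload,
                timeout=10,
            )
            return resp.json()
        except requests.RequestException:
            return {"event_result": "passed"}                 # monitoring failures don't break the agent
```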

StrandsAicebergHandler implements the hook lifecycle. It registers callbacks for six Strands events, extracts content from each one, sends it to AicebergMonitor, checks the result, and maintains event ID mappings for linking inputs to outputs.

_check_safety checks the response from Aiceberg. If event_result is "blocked" or "rejected", it raises SafetyException with a descriptive message. This stops the agent immediately so you can catch the exception and show a safe fallback.
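The check itself is small. A sketch that matches the behaviour described above, with the exception message mirroring the log example later on this page:

```python
class SafetyException(Exception):
    """Raised when Aiceberg blocks or rejects content."""

# Method on StrandsAicebergHandler (shown standalone for brevity).
def _check_safety(self, result: dict, stage: str) -> None:
    if result.get("event_result") in ("blocked", "rejected"):
        raise SafetyException(f"Content blocked by Aiceberg safety filter at {stage}")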

_extract_text_from_content handles Strands' message formats. Messages can be lists of content blocks or simple strings. This helper normalizes everything to plain text for user-facing events.

Quick setup (5 minutes)

Install deps (once):
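For example (package names are assumptions for a typical setup; adjust to match your project): pip install strands-agents strands-agents-tools python-dotenv requests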

Add environment variables (.env):

  • AICEBERG_API_KEY=Bearer ...

  • AICEBERG_PROFILE_ID=... (use this for all events)

Or use specific profiles for each event type:

  • AB_monitoring_profile_U2A=... (user↔agent events)

  • AB_monitoring_profile_A2M=... (agent↔LLM events)

  • AB_monitoring_profile_A2T=... (agent↔tool events)

  • AB_monitoring_profile_A2MEM=... (agent↔memory events; optional, defaults to A2T)

Register the handler:
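A minimal registration sketch. The calculator tool import from strands_tools is an assumption based on the walkthrough above (any tools work), and the aiceberg_strands module name is illustrative:

```python
from strands import Agent
from strands_tools import calculator                    # assumption: the calculator tool from the walkthrough

from aiceberg_strands import StrandsAicebergHandler     # illustrative module name for this integration

agent = Agent(
    tools=[calculator],
    hooks=[StrandsAicebergHandler()],                   # one line: every gate described above is now active
)

print(agent("What's 10 + 5?"))
```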

Run your agent and check the Aiceberg dashboard for events.

Event flow at a glance

| Strands hook | Input we send | Output we send | Aiceberg type | Safety check? |
| --- | --- | --- | --- | --- |
| MessageAddedEvent | User question | - | user_agt | Gate 1 |
| BeforeModelCallEvent | Messages array | - | agt_llm | Gate 2 |
| AfterModelCallEvent | - | LLM response | agt_llm | Gate 3 |
| BeforeToolCallEvent | Tool use object | - | agt_tool | Log only |
| BeforeToolCallEvent (memory) | Memory tool call | - | agt_mem | Optional |
| AfterToolCallEvent (memory) | - | Memory result | agt_mem | Optional |
| AfterToolCallEvent | - | Tool result | agt_tool | Log only |
| AfterInvocationEvent | - | Final answer | user_agt | Gate 4 |

Optional tool blocking: Tool safety checks are technically possible but not enabled by default. Add self._check_safety(result, "TOOL_INPUT") to block tools. By default, we log for audit but don't block - unsafe tool outputs get caught at Gate 3 or Gate 4.

Memory event type: Memory operations use agt_mem if you set AB_monitoring_profile_A2MEM in your environment. Otherwise, they appear as agt_tool alongside regular tools.

Logging & observability

Startup banner shows whether credentials were found and which profiles are loaded.

Each event prints a short preview: event type, profile ID (truncated), and whether it's input or output.

Successful sends show Aiceberg's response: "passed", "blocked", or "rejected".

Safety violations raise SafetyException with clear messages like "Content blocked by Aiceberg safety filter at LLM_OUTPUT".

Tool execution is logged but not blocked to avoid breaking agent state.

Safety exceptions surface as SafetyException; wrap your agent calls to show user-friendly messages.

Missing environment variables cause events to skip silently (they return "passed" so your agent keeps working).

Additional Info

The tool events include a feature to capture and monitor the invocation state, which holds metadata about tool calls. This can be used to enhance observability across multi-agent scenarios, enabling more robust coordination patterns.

This was discussed in a Strands feature request issue asking to make the related events available.

Memory in Strands

Your agent can remember things across conversations. Strands gives you two ways to do this:

  • FileSessionManager - Simple session persistence. Saves the entire conversation to a file. When you create a new agent with the same session ID, it loads the history. Good for basic chatbots where you just need conversation context.

  • mem0 with vector storage - Smart memory using embeddings. Stores facts in a vector database and retrieves them with semantic search (RAG). The LLM decides what to remember and when to recall it, which is better for complex apps where you need long-term memory.
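A sketch of the simpler option. The FileSessionManager import path and the session_manager parameter follow the Strands session docs as we understand them; treat the exact names as assumptions and check your Strands version:

```python
from strands import Agent
from strands.session.file_session_manager import FileSessionManager  # path may differ by version

# Re-creating an agent with the same session_id reloads the saved conversation.
session = FileSessionManager(session_id="user-42")
agent = Agent(session_manager=session)
agent("Remember that I prefer metric units.")
```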
