Open WebUI

Introduction

Open WebUI is a user-friendly, open-source, self-hosted web interface for interacting with large language models (LLMs). It supports both local models (via Ollama and similar runners) and cloud APIs (such as OpenAI).

Key Features

  • User-Friendly Interface: Offers a graphical interface similar to ChatGPT, simplifying interaction with AI models without requiring technical expertise.

  • Self-Hosted & Offline: Can run entirely offline, providing enhanced privacy and security by keeping user data local.

  • Model Support: Integrates with LLM runners like Ollama and can connect to various OpenAI-compatible APIs.

  • Built-in Tools: Includes features for RAG (Retrieval-Augmented Generation) and can be extended with tools for web browsing, image generation, and more.

  • Extensible: Supports plugins, functions, and pipelines to add new AI model integrations and customize workflows.

  • Security and Permissions: Provides granular control over user access and permissions to ensure a secure environment.

What did we do?

We integrated Aiceberg with Open WebUI's conversational chatbot use case and analyzed the workflow under the hood: the user-agent, agent-memory (RAG: not fully observable yet), and agent-LLM events, in both listen and enforce modes.

We achieved this by implementing the Pipes feature provided by the Open WebUI framework.

What is a Pipe?

Pipes are standalone functions that process inputs and generate responses, possibly by invoking one or more LLMs or external services before returning results to the user. Examples of potential actions you can take with Pipes are Retrieval Augmented Generation (RAG), sending requests to non-OpenAI LLM providers (such as Anthropic, Azure OpenAI, or Google), or executing functions right in your web UI. Pipes can be hosted as a Function or on a Pipelines server. (reference: https://docs.openwebui.com/pipelines/pipes)
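For orientation, here is a minimal sketch of the shape a Pipe takes when hosted as a Function: a Pipe class exposing a Valves settings model and a pipe(...) method that receives the OpenAI-style chat payload. The echo body is illustrative only.

```python
from pydantic import BaseModel, Field


class Pipe:
    class Valves(BaseModel):
        # Settings exposed in the Open WebUI admin panel.
        MODEL_ID: str = Field(default="gpt-4", description="Target model")

    def __init__(self):
        self.valves = self.Valves()

    def pipe(self, body: dict) -> str:
        # body is the OpenAI-style chat payload: {"messages": [...], ...}
        user_message = body["messages"][-1]["content"]
        return f"(echo) {user_message}"  # a real Pipe would call an LLM here
```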

Important Open WebUI use cases

  • Direct user-to-UI interaction with the model only, without RAG or tools.

  • User-to-UI interaction with RAG.

  • User-to-UI interaction with tools.

How Our Custom Pipeline / Pipe Works

1. Detect interaction type: Determine whether the interaction is direct, RAG, tool selection, or tool output.

2. Run a user input safety check via AIceberg: Perform a safety check and possibly block early.

3. Redact input if needed: Apply redaction to sensitive content before proceeding.

4. Monitor the “agent → model” payload: Capture the system prompt and context and send them to AIceberg.

5. Call the LLM: Invoke the configured LLM (OpenAI or Anthropic).

6. Monitor the model response via AIceberg: Inspect the model response and possibly block or redact.

7. Mirror / double-check the final response: Before returning to the user, perform a final check and possibly block or replace with the block message. Also ensure the outlet logic stores only redacted content in the DB.

We have implemented a Pipe Function called Custom AIceberg Monitor. It defines Valves for its parameters (model provider, target model, block message, monitoring profile), and its pipe(...) method implements the main flow above. We also have outlet logic to ensure that any logs/messages stored in the DB contain redacted content, not raw content.
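A condensed sketch of that flow is below. The helpers aiceberg_check(), redact(), and call_llm() are hypothetical stand-ins for the real Aiceberg EAP client and provider calls; their names, signatures, and the verdict shape are illustrative, not the actual API, and blockMessage is an assumed valve name for the block message.

```python
from pydantic import BaseModel

# Hypothetical stand-ins for the Aiceberg EAP client and LLM provider
# calls; illustrative only, not the actual Aiceberg API.
class Verdict(BaseModel):
    blocked: bool = False

def aiceberg_check(event_type: str, content: str, profile: str) -> Verdict:
    ...  # report the event to the EAP and return its verdict

def redact(content: str, verdict: Verdict) -> str:
    ...  # mask the spans the verdict marked as sensitive

def call_llm(provider: str, model: str, prompt: str) -> str:
    ...  # invoke the configured OpenAI or Anthropic model


class Pipe:
    # Valves (model provider, target model, block message, monitoring
    # profile) are sketched in the Configurations section below.

    def pipe(self, body: dict) -> str:
        user_input = body["messages"][-1]["content"]

        # Steps 1-3: classify the interaction, run the user input
        # through AIceberg, block early if flagged, redact if needed.
        verdict = aiceberg_check("user_agent", user_input,
                                 self.valves.monitoringProfile)
        if verdict.blocked:
            return self.valves.blockMessage
        user_input = redact(user_input, verdict)

        # Step 4: report the "agent → model" payload (assuming the
        # first message carries the system prompt) before the model
        # sees it.
        system_prompt = body["messages"][0]["content"]
        aiceberg_check("agent_llm", system_prompt + "\n" + user_input,
                       self.valves.monitoringProfile)

        # Step 5: call the configured LLM.
        response = call_llm(self.valves.modelProvider,
                            self.valves.modelName, user_input)

        # Steps 6-7: inspect the response; block or redact before it
        # reaches the user. The outlet later persists only the
        # redacted text, never the raw content.
        verdict = aiceberg_check("agent_llm", response,
                                 self.valves.monitoringProfile)
        if verdict.blocked:
            return self.valves.blockMessage
        return redact(response, verdict)
```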

Dataflow observed for the RAG-enabled chatbot use case

1. Select pipeline: The user selects the Aiceberg-integrated custom pipeline (AiceMonitor) as the model for the current chat.

2. User prompt: The user sends a prompt in the chat box. If RAG is needed, the user selects the knowledge document to provide as context.

3. RAG component retrieves context: The prompt enters the RAG component, which retrieves the top relevant chunks. A built-in template formatter concatenates the system instruction, user prompt, and extracted RAG output into one instruction ready for the custom pipeline.

4. AiceMonitor sends to Aiceberg EAP: The formatted instruction enters AiceMonitor, which sends it to Aiceberg's EAP according to the mode (listen or enforce).

Note: Since this use case runs within one environment and makes no external calls, the custom pipeline is the complete backend. Listen and enforce modes are therefore nearly equally powerful at controlling the input/output flow here; the only difference is whether the LLM response is obtained via a direct LLM call or through Aiceberg's EAP. In scenarios with additional capabilities, listen mode would be less powerful, because it only receives copies of the input/output, and enforcing changes in-flow is more complex.
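The mode difference the note describes can be pictured as a single branch inside the pipe. In the sketch below, eap_report() and eap_invoke() are hypothetical stand-ins for the Aiceberg EAP client, not its real API, and instruction is the formatted prompt from step 3.

```python
# Illustrative branch on the monitoringMode valve inside the pipe;
# eap_report() and eap_invoke() are hypothetical EAP client calls.
if self.valves.monitoringMode == "listen":
    # Listen: call the target LLM directly, then send a copy of the
    # exchange to the EAP for observation only.
    response = call_llm(self.valves.modelProvider,
                        self.valves.modelName, instruction)
    eap_report(instruction, response)
else:  # enforce
    # Enforce: route the instruction through the EAP, which applies
    # the profile's signals and returns (or blocks) the response of
    # the target LLM.
    response = eap_invoke(instruction, self.valves.monitoringProfile)
```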

Different events

  • We can currently observe the user_agent and agent_llm events.

  • We couldn't intercept the exact point where RAG happens; the agent_llm event contains the output of agent_mem as well (see the sketch after this list). We will investigate whether the framework allows observing a distinct RAG event or whether the current behavior is the only possibility.

  • If the user query is flagged as malicious, the current flow lets us interrupt before the LLM is called.
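To make the second point concrete, the events visible for one RAG turn currently look roughly like this; the dict shape and field names are illustrative, not Aiceberg's schema.

```python
# Events observable for a single RAG turn (illustrative shape only).
events = [
    {"type": "user_agent", "payload": user_prompt},
    # No distinct agent_mem event can be intercepted yet: the
    # retrieved RAG chunks arrive already folded into the
    # agent_llm payload below.
    {"type": "agent_llm",
     "payload": system_instruction + rag_context + user_prompt},
]
```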

Open WebUI RAG with AIceberg monitoring (LISTEN mode)

Configurations

Why do we need it?

We used Open WebUI configuration settings to make the custom pipeline flexible enough to adapt to whatever monitoring profile and target LLM the user selects. In other words, we build one Aiceberg pipeline that works for any profile the user selects (EAP signal settings plus target LLM in enforce mode) and for any target LLM selection (in listen mode).

How did we do it?

We configured the input variables of the pipeline using Valves. Valves are configurable parameters that allow you to control and customize pipeline, filter, and tool behavior. They function like settings or "knobs" that influence how data flows and is processed without modifying core code.

Configuring different model providers/models in Valves

  • monitoringMode: listen

  • modelProvider: openai, modelName: gpt-4

  • modelProvider: anthropic, modelName: claude-3-5-sonnet

  • monitoringProfile: <profile_id>

Based on the valve settings, the pipeline automatically applies them and executes the flow, as sketched below.
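Concretely, the Valves model for these knobs might look like this sketch. The field names follow the list above; the defaults are illustrative, and blockMessage is an assumed name for the block-message valve mentioned earlier. Open WebUI surfaces these fields in the function's settings UI, so switching provider, model, or profile requires no code change.

```python
from pydantic import BaseModel, Field


class Valves(BaseModel):
    # Knobs surfaced in the Open WebUI settings panel for the pipe.
    monitoringMode: str = Field(default="listen",
                                description="listen or enforce")
    modelProvider: str = Field(default="openai",
                               description="openai or anthropic")
    modelName: str = Field(default="gpt-4",
                           description="Target model for the provider")
    monitoringProfile: str = Field(default="",
                                   description="Aiceberg profile ID")
    blockMessage: str = Field(default="This request was blocked.",
                              description="Returned when a check blocks")
```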

Follow-up

  • Clean up the monitoring logs so they display the proper order of operations according to the flow.

  • Investigate and observe the agent_mem event.

How to use

1. Add the pipe to Open WebUI: Add custom_aiceberg_pipe.py to the pipelines in your Open WebUI environment (see the link about how to do this). This enables Aiceberg to LISTEN.

2. Apply valve settings: Apply the Aiceberg valve configuration to define the monitoring profile(s) to be applied by Aiceberg.
