What and how?

What is AI explainability, and how does it work at Aiceberg?

What?

AI explainability is about making the decisions of artificial intelligence systems understandable to humans, especially people without technical backgrounds. Instead of treating AI like a “black box” that produces answers with no insight into how it got there, explainability provides clear, simple reasoning behind those outputs—such as what information the AI focused on, which factors most influenced its decision, and why it preferred one option over another. This transparency helps people trust AI systems, spot potential errors or biases, and make better-informed choices when using AI in real-world settings, from finance to healthcare to customer service.

How?

Aiceberg's models are explainable because they do not rely on opaque, hard-to-interpret training processes. Instead, they make decisions by comparing any new input directly to a curated, structured library of real examples, using measurable semantic similarity. In practice, this means we can always show why the model reached a conclusion: we can point to the specific samples it considered most similar, how close they were, and how those neighbors contributed to a classification. Unlike traditional black-box AI systems, where reasoning is buried inside millions of hidden parameters, our approach ensures that every result can be traced back to transparent, observable relationships within the data itself. This makes the system inherently auditable and interpretable for customers and regulators alike, without exposing any proprietary algorithms.
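At a high level, this works like a similarity-weighted nearest-neighbor lookup. The sketch below illustrates that general pattern only, not Aiceberg's implementation: the `embed` function is a stand-in for whatever sentence-embedding model is actually used, and the library contents, labels, and choice of k are invented for the example.

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    """Stand-in for a real sentence-embedding model.
    Returns a fixed-size unit vector so the sketch runs end to end."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.normal(size=384)
    return v / np.linalg.norm(v)

# Hypothetical curated library of (example text, label) pairs.
LIBRARY = [
    ("Women don't belong in leadership roles.", "toxic"),
    ("People like you shouldn't be allowed here.", "toxic"),
    ("What time does the store open?", "benign"),
    ("Thanks for the quick response!", "benign"),
]

def classify_with_neighbors(text: str, k: int = 3):
    """Compare the input to every library example by cosine similarity,
    keep the k closest neighbors, and let each neighbor vote with its
    similarity as its weight. The neighbors themselves are the explanation."""
    q = embed(text)
    scored = []
    for example, label in LIBRARY:
        sim = float(np.dot(q, embed(example)))  # cosine similarity of unit vectors
        scored.append((sim, example, label))
    scored.sort(reverse=True)
    neighbors = scored[:k]

    votes: dict[str, float] = {}
    for sim, _, label in neighbors:
        votes[label] = votes.get(label, 0.0) + sim
    predicted = max(votes, key=votes.get)
    return predicted, neighbors

label, neighbors = classify_with_neighbors("There are too many women in boardrooms.")
print(label)
for sim, example, lbl in neighbors:
    print(f"  {sim:.2f}  [{lbl}]  {example}")
```

Because the prediction is just a weighted vote over retrieved examples, the evidence behind any decision is available by construction rather than reconstructed after the fact.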

Example 1 — Classifying a toxic statement

A user enters: “There are too many women in boardrooms.” Our system can explain the result by showing the actual samples it compared the input to, along with how strongly each one influenced the decision:

  • The closest neighbor (most similar example) might be a sexist statement → highest weight

  • The next similar one might be a general identity attack → medium weight

  • Another might be a hate-speech-related example → lower weight

Because decisions come from weighted comparisons to known samples, we can say: “Your message was classified as toxic because it closely matched these examples, and here is how much each one contributed to the result.”

This gives a transparent and understandable explanation without revealing proprietary algorithms.
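To make the weighted contributions concrete, the snippet below shows one way the neighbor similarities could be normalized into per-example contributions and rendered as the explanation sentence above. The example texts, category names, and similarity scores are invented for the sketch.

```python
# Illustrative neighbor matches for Example 1; texts, categories, and
# similarity scores are made up, not real model output.
neighbors = [
    ("Women don't belong in leadership roles.", "sexist statement", 0.91),
    ("People like you shouldn't be allowed here.", "identity attack", 0.78),
    ("Go back to where you came from.", "hate speech", 0.64),
]

total = sum(sim for _, _, sim in neighbors)

print("Your message was classified as toxic because it closely matched these examples:")
for example, category, sim in neighbors:
    contribution = 100 * sim / total  # each neighbor's share of the decision
    print(f'  - "{example}" ({category}): {contribution:.0f}% of the decision')
```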

Example 2 — Identifying potential fraud intent

A user types: “How can I hide transactions from regulators?” The system evaluates the input against thousands of known examples. It can then show:

  • A very close match to a sample about hiding financial activity → highest weight

  • A similar example about evading compliance checks → medium weight

  • A broader example of financial misconduct → lower weight

The explanation becomes: “This was flagged because it is semantically closest to these fraud-related examples, with each neighbor contributing to the classification based on similarity.”

This makes the decision feel concrete and understandable.
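Because every flag is backed by specific matched examples and their weights, the same evidence can also be captured as an audit record. The sketch below assembles one for this example; the record format, field names, texts, and scores are hypothetical rather than Aiceberg's actual schema.

```python
import json
from datetime import datetime, timezone

# Illustrative neighbor matches for Example 2; texts, categories, and
# similarity scores are invented for the sketch.
neighbors = [
    ("How do I keep these transfers off the books?", "hiding financial activity", 0.93),
    ("Ways to get around a compliance review", "evading compliance checks", 0.81),
    ("Tips for misreporting revenue", "financial misconduct", 0.67),
]

total = sum(sim for _, _, sim in neighbors)

# Hypothetical audit record: the flag is stored together with the
# neighbors that produced it and each neighbor's normalized contribution.
record = {
    "input": "How can I hide transactions from regulators?",
    "decision": "flagged: potential fraud intent",
    "timestamp": datetime.now(timezone.utc).isoformat(),
    "evidence": [
        {
            "matched_example": example,
            "category": category,
            "similarity": sim,
            "contribution": round(sim / total, 2),
        }
        for example, category, sim in neighbors
    ],
}

print(json.dumps(record, indent=2))
```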
