← Case studies
Audit + Deploy

Giving your business a memory

FUNCTION
Operations and Technology
SECTOR
Technology / Software
TIMEFRAME
12 weeks

Most businesses discover they have a knowledge problem in one of two ways. Either someone leaves and takes half the institutional knowledge with them. Or they notice, slowly, that their best people are spending more time answering questions than doing the work they were hired to do. By the time either one is obvious, the cost is already significant.

What we found

When we ran the audit, we expected to find an efficiency problem. We did. But when we dug further, we found something more serious.

THE EFFICIENCY PROBLEM

Nearly a third of all internal queries required a human interrupt. Someone had to stop what they were doing to answer a question that, in theory, the business already had the answer to.

A customer calls with a question. Your support person does not know the answer, so they message a developer. The developer stops what they are doing, finds the answer, and sends it back. Ten minutes gone. Multiply that by fifty interactions a day and you have a business quietly losing time on problems that are already solved, just trapped where nobody can reach them.

THE CONTINUITY RISK

A significant portion of how the business actually worked lived inside the heads of three or four people. Not documented anywhere. Not written down. Just known by the developer who built the critical system four years ago, by the manager who was in the room when the decision was made, by the founder who remembered why things were done a certain way.

If any of those people left, that knowledge walked out with them. And every new hire started from zero, asking the same questions of the same senior people, over and over.

Two problems. One sitting visibly on the surface, costing time every day. One sitting underneath, invisible until something went wrong.

What we built

We connected everything the business had already created (its documentation, its meeting notes, its project history) and made it searchable through a single AI interface. Anyone in the business can now ask a question in plain language and get a direct answer, sourced from the business's own material.

The system does not guess. It does not make things up. It retrieves what already exists and surfaces it instantly.

Critically, it does not replace anyone. People still make every decision. The system just makes sure they have the right information in front of them when they do.

How it works

Five layers, each building on the last. Plain language first, technical detail below.

01
The knowledge base

Everything the business has ever written down (documents, meeting notes, code, support tickets, project updates) gets collected and stored in one place. Think of it as building a library out of everything that already exists in the business.

TECHNICAL

All source documents are ingested and chunked into logical segments. Each chunk is run through an embedding model that converts it into a high-dimensional vector (1024 dimensions). These vectors are stored in a pgvector database alongside the original text, creating a searchable index of the entire knowledge base.

02
Making it searchable

When someone types a question, the system does not search for exact words like Google does. It searches for meaning. So if you ask about "staff leave policy" it will find the document that calls it "employee annual leave" because it understands they mean the same thing.

TECHNICAL

At query time the user's input is embedded using the same model used at ingestion, producing a query vector in the same dimensional space. A similarity search is run against the stored vectors using cosine distance or dot product. The system returns the chunks whose vectors are closest to the query vector: semantic matches rather than keyword matches.

03
Getting the right context before answering

Before the AI even tries to answer your question, the system quietly runs a search in the background. It finds the most relevant information from the knowledge base and loads it into the conversation so the AI has everything it needs before it says a word.

TECHNICAL

User queries are pre-processed by a lightweight model that compresses the input into a clean semantic query string. This is used to perform a RAG lookup against the vector store before the main LLM call. The retrieved chunks are injected into the system prompt as context. By the time the LLM receives the user's question, the relevant source material is already present, reducing the need for tool calls during generation and significantly lowering hallucination risk.

04
The AI answers from what exists, not what it thinks

The AI only answers using information it has been given from the business's own material. It does not guess or fill in gaps from general knowledge. If the answer is not in the knowledge base, it says so and goes looking rather than making something up.

TECHNICAL

The LLM is prompted with explicit instructions to answer only from the injected context. If the pre-loaded context is insufficient, the model is permitted to invoke the RAG tool directly to retrieve additional chunks. This creates a fallback retrieval path while maintaining the efficiency of silent context enrichment as the primary route. The system is model-agnostic. The retrieval and ingestion layers are fully decoupled from the LLM provider, allowing routing to different models by query type or cost tier.

05
Staying current automatically

The knowledge base does not go stale. Every night the system checks for anything new (a meeting that was written up, a document that was updated, a decision that was recorded) and adds it automatically. By the next morning it is in the system and findable.

TECHNICAL

A delta ingestion pipeline runs on a nightly schedule. It checks source systems (Confluence, Git, Jira, or equivalent) for new or modified content since the last ingestion run. Only changed or new documents are re-chunked and re-embedded, keeping compute costs low. Updated vectors replace their predecessors in the store. The full knowledge base does not need to be rebuilt for each update cycle.

What changed

Before

A new hire spent their first two months interrupting senior people to understand how things worked.

After

They ask the system.

Before

A manager needed a status meeting to understand where a project stood.

After

They ask the system.

Before

A support rep escalated half their queries because they did not have access to the answer.

After

They ask the system.

The business did not change. The time it spent looking for what it already knew did. And the knowledge that used to live in a handful of people now lives somewhere the whole business can reach.

The thing most businesses get wrong about AI

They try to replace work before they have made existing work findable. The fastest return on any AI investment is almost always in the layer underneath: making what the business already knows accessible to everyone who needs it, in the moment they need it.

That is where we started here. And it is usually where we start.

Start here

If this sounds like something happening in your business, the audit is where we find out.

It takes two weeks, it is fixed cost, and it will tell you exactly where the time is going.

Get in touch