I have worked more than my fair share in many different regulated environments. Here are some tips.

Set up a unified knowledge base for yourself (and possibly your entire team) from day one.
- Have one trusted person push all important documents into this location. Then use vectorization and semantic search for your team to find the right document when they need it.
Set up a shared calendar for important dates (like renewing certificates). Remember: automated notifications that periodically check whether a server's certificate is about to expire are easy to miss on their own. These issues can and will ruin your Christmas (trust me).
Use Docker/Podman containers. If possible, build them with Nix for pinned dependencies and reduced overhead.

In this post I would like to delve into the first point: RAGs.

What is a RAG and why should you care

Retrieval-Augmented Generation (RAG) is simple in principle: before you ask a language model something, you first search your own knowledge base for relevant context, and you stuff that context into the prompt. The model then answers based on your data, not only on what it saw during training. An example workflow for this would be:

flowchart TD
    A[User question] --> B[Vector similarity search over your documents]
    B --> C["Top-k relevant chunks + original question"]
    C --> D[LLM generates a grounded answer]
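
As a rough sketch of the last two boxes in the flowchart, assembling the prompt could look like this. The function name `build_prompt`, the prompt wording, and the example chunks are placeholders of mine and not part of any particular library; the actual LLM call is left out on purpose.

```python
def build_prompt(question: str, chunks: list[str]) -> str:
    """Combine retrieved chunks and the user question into one prompt."""
    # Number the chunks so the answer can point back to its sources.
    context = "\n\n".join(
        f"[{i}] {chunk}" for i, chunk in enumerate(chunks, start=1)
    )
    return (
        "Answer the question using only the context below. "
        "If the context is not sufficient, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# Hypothetical chunks, standing in for what a vector search might return.
retrieved = [
    "TLS certificates for the ingress are rotated every 60 days.",
    "Rotation is triggered by a scheduled renewal job.",
]
prompt = build_prompt("How does certificate rotation work?", retrieved)
print(prompt)
```

The resulting string is what you would send to the model of your choice. Telling the model to admit when the context is insufficient is one common way to discourage made-up answers, although it is no guarantee.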

The interesting part is the “vector similarity search” step. You embed your documents into a high-dimensional vector space using an embedding model, and you store those vectors in a vector database. When a query comes in, you embed the query the same way, and find the nearest neighbours. Nearest in embedding space means “semantically similar”, not just keyword-matching. That is what makes it actually useful: you can ask “how does the certificate rotation work for the ingress controller” and get back the relevant runbook section even though it is titled “TLS lifecycle management for nginx-based ingress”.
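
To make the nearest-neighbour step less abstract, here is a minimal sketch using only the Python standard library. The 3-D vectors and document names are made up for illustration (a real embedding model produces hundreds of dimensions), and I use cosine similarity as the similarity measure, which is a common choice but not the only one.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query: list[float], index: dict[str, list[float]], k: int = 2) -> list[str]:
    """Return the k document ids whose vectors are most similar to the query."""
    ranked = sorted(
        index,
        key=lambda doc_id: cosine_similarity(query, index[doc_id]),
        reverse=True,
    )
    return ranked[:k]


# Made-up "embeddings" for three hypothetical documents.
index = {
    "tls-lifecycle-runbook": [0.9, 0.1, 0.0],
    "backup-policy": [0.0, 0.2, 0.9],
    "ingress-setup-guide": [0.7, 0.6, 0.1],
}
query_vector = [0.8, 0.3, 0.0]  # pretend embedding of the user question

print(top_k(query_vector, index, k=2))
# → ['tls-lifecycle-runbook', 'ingress-setup-guide']
```

A vector database does essentially this, just with approximate search structures so that it stays fast for millions of vectors instead of three.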

To make this intuitive, here is a toy example of what “similar in embedding space” looks like. The words below occupy a 3-D slice of embedding space – notice how “dog/walk/park” cluster together on one side and “car/drive/highway” on the other, and how the relations between them (dog walks in the park; cars drive on the highway) form parallel geometric paths. In other words, we can apply a distance-preserving affine transformation (here simply a translation) to the whole space that maps one “concept”, like walking your dog in the park, to the other “concept”, like driving your car on the highway:

Note

You may skip this part, if you are not interested in the geometric intuition or theory behind embedding spaces. But I think it is really cool, so here we go:

Notice how the red arrows behave much like the horizontal arrows in a commutative diagram? That is by no means a coincidence!

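In fact, the whole point of embedding space is that it is a vector space, so you can do vector arithmetic on it. Strictly speaking, the translation \(x \mapsto x + v\) for some vector \(v \in V\) is an affine map, not a linear one, since it moves the origin unless \(v = 0\). The categorical picture still works, though: if we consider the additive group \((V, +)\) as a category with one object, every vector \(v\) is a morphism, and composition of morphisms is vector addition.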

Interested in more functors? Maybe check out this post.
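
The vector arithmetic from this note can be checked in a few lines. The coordinates below are hand-picked by me so that the two word triples are exactly parallel; real embeddings only show this behaviour approximately. The names `translate` and `offset` are just illustrative.

```python
import math

# Hand-picked toy coordinates, not the output of any embedding model.
dog_concept = {
    "dog": (1.0, 2.0, 0.0),
    "walk": (2.0, 3.0, 0.0),
    "park": (3.0, 2.0, 0.0),
}
car_concept = {
    "car": (5.0, 1.0, 2.0),
    "drive": (6.0, 2.0, 2.0),
    "highway": (7.0, 1.0, 2.0),
}


def translate(point: tuple, v: tuple) -> tuple:
    """Apply the translation x -> x + v to a single point."""
    return tuple(p + c for p, c in zip(point, v))


# The one vector that maps "dog" onto "car".
offset = tuple(c - d for d, c in zip(dog_concept["dog"], car_concept["car"]))

# The same translation maps the whole dog triple onto the car triple ...
mapped = [translate(p, offset) for p in dog_concept.values()]
print(mapped == list(car_concept.values()))

# ... and it preserves distances between words.
print(
    math.dist(dog_concept["dog"], dog_concept["walk"])
    == math.dist(car_concept["car"], car_concept["drive"])
)
```

Both checks print `True` here because the toy data was constructed that way. With real embeddings you would compare against a tolerance instead of testing for exact equality.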

For the vector database I used ChromaDB. It is fast and widely used. Without going too much into it, there are several embedding models you may use (like gemini-embedding-001, text-embedding-3-small or text-embedding-3-large). For ease of use, I used all-MiniLM-L6-v2 for an MVP, which can easily be run locally without additional API keys.
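
A minimal sketch of how indexing and querying with ChromaDB might look is shown below. It is not my production setup: the helper names `chunk_text` and `index_and_query`, the collection name, and the chunk sizes are assumptions for illustration. According to Chroma's documentation the default embedding function is all-MiniLM-L6-v2, so no model is configured explicitly here; please verify that against the version you install.

```python
def chunk_text(text: str, size: int = 50, overlap: int = 10) -> list[str]:
    """Split text into overlapping chunks of at most `size` words."""
    if size <= overlap:
        raise ValueError("size must be larger than overlap")
    words = text.split()
    if not words:
        return []
    step = size - overlap
    return [
        " ".join(words[start:start + size])
        for start in range(0, max(len(words) - overlap, 1), step)
    ]


def index_and_query(documents: dict[str, str], question: str, k: int = 3) -> dict:
    """Index all chunks in an in-memory Chroma collection and run one query."""
    # Imported here so that chunk_text also works without chromadb installed.
    import chromadb

    client = chromadb.Client()  # in-memory; data is gone when the process exits
    collection = client.get_or_create_collection(name="regulatory_docs")
    for doc_id, text in documents.items():
        chunks = chunk_text(text)
        if not chunks:
            continue  # skip empty documents
        collection.add(
            ids=[f"{doc_id}-{i}" for i in range(len(chunks))],
            documents=chunks,
            metadatas=[{"source": doc_id} for _ in chunks],
        )
    # Chroma embeds the question with the same model and returns the nearest chunks.
    return collection.query(query_texts=[question], n_results=k)
```

Calling `index_and_query({"runbook": some_text}, "how are certificates rotated?")` returns a dictionary in which `result["documents"][0]` holds the matching chunks and `result["metadatas"][0]` tells you which document each one came from. For anything beyond a quick experiment you would probably swap `chromadb.Client()` for `chromadb.PersistentClient(path=...)` so the index survives a restart.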

A simple example

Assume you have many different regulatory texts (in my case these were BSI IT-Grundschutz documents, but it could be anything). You will have to read them one by one eventually, just to be sure that you really know what they are about. But what if you have read them and later want to find out where in the text that one passage about topic \(X\) was?

Well, you may use your RAG (and preferably set it up with MCP, but that’s a topic for a different time), so you can just ask “where in the BSI IT-Grundschutz is the part about certificate management?” (for \(X\) being “certificate management” in this case) and it will give you the relevant passage, even if the document does not explicitly mention “certificate management” but rather talks about “TLS lifecycle management” or something like that.

Why would your company benefit from setting up a RAG?

Does regulation seem like a never-ending nightmare?
Does your company drown in paperwork and nobody seems to know where the important documents are?
Do you have to read the same documents over and over again, just to find that one passage that is relevant to your current problem?
Do you fear that if that one person who knows where the important documents are leaves, you will be in big trouble?

Fear not! I can help you with setting up a RAG. You may schedule a free consultation with me, and I will be happy to help you with that.

📅 Book a free consultation