define rag --plain-english

Illustration for "RAG" from the Non-Technical Technical Dictionary

RAG

TLDR: There are two ways to take a test: from memory, or with the book open. RAG is how you hand AI the book.

There are two ways to take a test. Walk in and answer from memory, or walk in with the book open, flip to the right page, and answer from that. AI defaults to the first one. RAG is how you hand it the book.

Here's the closed-book problem. A model answers from what it absorbed during training, like a student recalling everything they ever read. Ask it about your stuff, though, and you hit a wall. Your contracts, your prices, last week's numbers, your help docs. None of that was in the books it studied. So it hallucinates. It answers anyway, confidently, and makes up something that sounds right. Confident But Wrong.

RAG flips the test to open-book. It stands for retrieval-augmented generation, which is a mouthful for a simple move. Before the model answers, go fetch the relevant pages from your documents and slip them into the context window first. Then the model answers from the open page in front of it instead of from memory.

I actually showed you the engine for this back when we covered the vector database, without naming the whole thing. The loop where it drops your question on a map of meaning and grabs the nearest chunks of your files? That fetching is the "retrieval" half. RAG is the full sandwich:

  1. You ask a question.

  2. A search runs over your documents and grabs the handful of pieces most relevant to the question, usually by meaning rather than exact words (that's embeddings).

  3. Those pieces get pasted into the prompt, and the model writes its answer from them. It can point to what it actually found instead of guessing.

That's the whole trick. The model didn't get smarter. It just stopped answering from memory and started looking things up first.
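If you want to see the three steps as code, here is a minimal sketch. Everything in it is an assumption for illustration: the three "documents" are made up, the function names are mine, and the `embed` function just counts shared words, which is a crude stand-in for real embeddings that match on meaning. It stops at building the prompt, since the model you send it to is up to you.

```python
import math
import re
from collections import Counter

# Made-up help-center snippets standing in for "your documents".
DOCS = [
    "Refunds are available within 30 days of purchase.",
    "Our support team answers email on weekdays from 9 to 5.",
    "The Pro plan includes unlimited projects and priority support.",
]

def embed(text):
    """Toy stand-in for an embedding: lowercase word counts.
    A real system would use an embedding model here."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def retrieve(question, docs, k=2):
    """Step 2: grab the k pieces closest to the question."""
    q = embed(question)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(question, docs, k=2):
    """Step 3: paste the retrieved pieces into the prompt."""
    pages = "\n".join(f"- {p}" for p in retrieve(question, docs, k))
    return (
        "Answer using only the pages below.\n"
        f"Pages:\n{pages}\n"
        f"Question: {question}"
    )

# Step 1: you ask a question.
question = "Are refunds available after purchase?"
print(build_prompt(question, DOCS))
```

The printed prompt is the open book: the question plus the pages that were fetched for it. The model never sees the rest of the pile.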

This is the pattern behind most "AI that knows my business" products. The support bot that answers from your help center. The assistant that quotes your own policies back at you. The thing that reads your 200-page contract and finds the one clause that matters. None of those needs custom training. Someone pointed a normal model at a pile of documents and told it to look before it speaks.

One honest catch, the same one I flagged with the vector database. RAG only answers as well as what it retrieves. Hand it the wrong pages and it'll confidently answer from the wrong pages. If your question lands on something your documents don't actually cover, it'll still grab the closest pages it can find and answer from those, even when the closest pages have almost nothing to do with the question. Garbage near your question is still garbage. So when an AI answers from your files and it's subtly off, this is usually where it slipped. It found the closest-looking page, not the correct one.
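Here is that catch in miniature, as a rough sketch rather than a recipe. The documents are made up, the word-count `embed` is a toy stand-in for real embeddings, and the `min_score` cutoff of 0.3 is an arbitrary number I picked for the example. One common guard is to refuse to hand over a page when even the best match scores poorly, and that is what `retrieve_or_refuse` (a hypothetical helper) does.

```python
import math
import re
from collections import Counter

# Made-up documents for illustration. Nothing here mentions office hours.
DOCS = [
    "Refunds are available within 30 days of purchase.",
    "The Pro plan includes unlimited projects and priority support.",
]

def embed(text):
    """Toy stand-in for an embedding: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def best_match(question, docs):
    """Return (score, page) for the closest page, however weak the match."""
    q = embed(question)
    scored = [(cosine(q, embed(d)), d) for d in docs]
    return max(scored, key=lambda pair: pair[0])

def retrieve_or_refuse(question, docs, min_score=0.3):
    """Hand over the closest page only if it clears an (assumed) cutoff."""
    score, page = best_match(question, docs)
    return page if score >= min_score else None

covered = "Are refunds available after purchase?"
uncovered = "What are your office hours in Berlin?"

# Plain nearest-page search always returns something, even off-topic.
print(best_match(uncovered, DOCS))

# With a cutoff, the off-topic question gets no page at all.
print(retrieve_or_refuse(covered, DOCS))
print(retrieve_or_refuse(uncovered, DOCS))
```

The first print shows the refund page coming back for a question about office hours, with a low score, because it was the closest thing available. A cutoff doesn't make retrieval smarter. It just lets the system say "I couldn't find that" instead of answering from the wrong page.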

There's a second way to make AI know your world, and it's the opposite move. You change what's in its head instead of handing it a book. That's fine-tuning, and we'll put the two head to head soon.

Closed-book AI guesses. Open-book AI looks it up first. RAG is the open book.