MCP vs RAG, Context Windows, and How to Stop Your AI From Making Things Up About Big Data
Every executive I talk to has the same question dressed up five different ways. Why does my LLM hallucinate when I feed it the quarterly report. Why does it forget what we said three prompts ago. Why does it confidently invent a number that does not exist in the file I just uploaded.
The answer lives in three concepts most people conflate. Context windows. RAG. MCP. Get these straight and most of your frustration goes away.
A context window is the working memory of an LLM. Claude Opus 4.6 holds around 200K tokens. Gemini 3.1 Pro pushes 2M. Sounds infinite. It is not. Chroma published research last year naming the failure mode: context rot. The longer your context gets, the worse the model performs at retrieving facts from the middle of it. So stuffing a 400-page policy document into one prompt is not a strategy. It is the reason your answers feel mushy.
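The arithmetic is worth making concrete. A rough sketch using the tiktoken tokenizer (my choice for illustration; any tokenizer gives the same order of magnitude), with made-up page and word counts:

```python
# pip install tiktoken
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

# Back-of-envelope assumptions, not measurements: 400 pages at ~500 words each.
pages = 400
words_per_page = 500
sample = "The policy applies to all full-time employees effective January 1. " * 10

# English prose runs a bit over one token per word; measure it on the sample.
tokens_per_word = len(enc.encode(sample)) / len(sample.split())
estimated = int(pages * words_per_page * tokens_per_word)

print(f"~{estimated:,} tokens")                         # roughly a quarter million
print(f"Fits in a 200K window: {estimated < 200_000}")  # False
```

One ordinary policy binder already blows past the window, before you add the conversation itself.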
RAG, retrieval-augmented generation, is the workaround. You chunk your documents, store them as vectors, and let the model pull only the relevant slices at query time. This is what most enterprise AI search products do under the hood. It works. It also has a ceiling. RAG is great for static documents. It struggles with anything live, anything that changes, anything that requires the model to act on a system instead of just reading it.
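The loop itself is small. A minimal sketch using Chroma's Python client, where the default embedding model does the vector work; the collection name and policy snippets are invented for illustration:

```python
# pip install chromadb
import chromadb

client = chromadb.Client()  # in-memory; use chromadb.PersistentClient(path=...) for real use
policies = client.create_collection(name="policy_docs")

# Chunked documents go in once. Chroma embeds them with its default model.
policies.add(
    ids=["travel-001", "travel-002"],
    documents=[
        "Employees may book economy airfare for trips under six hours.",
        "International travel requires VP approval two weeks in advance.",
    ],
)

# At query time, pull only the relevant slice and hand it to the LLM as context.
results = policies.query(query_texts=["Who approves overseas trips?"], n_results=1)
print(results["documents"][0])  # the chunk the model should reason over
```

The model never sees the whole binder. It sees two sentences that actually answer the question.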
That is where MCP comes in. Model Context Protocol, open-sourced by Anthropic, is becoming the standard way LLMs talk to live tools and data. Think of it as USB for AI. Your CRM, your ticketing system, your data lake, your internal wiki all expose themselves through one consistent protocol, and the model queries them in real time. Menlo Ventures pegged enterprise MCP adoption north of 60 percent of new AI deployments this year, and that number is climbing.
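Exposing a system through MCP can be a few lines. A sketch using the official MCP Python SDK; the ticketing tool and its data are stand-ins for whatever your real system returns:

```python
# pip install mcp
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("ticketing")

@mcp.tool()
def get_open_tickets(customer_id: str) -> list[dict]:
    """Return open support tickets for a customer, fresh at query time."""
    # Stand-in data; a real server would call your ticketing system's API here.
    return [{"id": "T-1042", "customer": customer_id, "status": "open"}]

if __name__ == "__main__":
    mcp.run()  # speaks MCP over stdio; any MCP-capable client can now call the tool
```

No nightly export, no stale vector index. The answer is as current as the system it came from.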
So what should a leader actually do with this. Three things.
One. Stop pasting giant documents into chat. If your team is doing this, they are fighting context rot every day. Build or buy RAG for the static stuff.
Two. Map your data into two buckets. Reference data that rarely changes. Operational data that changes daily. RAG handles the first. MCP handles the second. Both feed the same agent. A sketch of that split follows this list.
Three. Treat hallucinations as a system design problem, not a model problem. The model is not lying. It is filling gaps in bad context. Better retrieval, smaller relevant chunks, and live tool access fix more hallucinations than swapping LLMs ever will.
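Here is what that two-bucket mapping can look like on paper; the source names, update intervals, and the one-day threshold are illustrative, not a prescription:

```python
from datetime import timedelta

# Illustrative inventory: each source tagged with how often it actually changes.
SOURCES = {
    "hr_policies":   timedelta(days=180),  # reference: embed once, retrieve via RAG
    "product_specs": timedelta(days=30),
    "crm_pipeline":  timedelta(hours=1),   # operational: query live over MCP
    "ticket_queue":  timedelta(minutes=5),
}

def access_path(update_interval: timedelta) -> str:
    """Rule of thumb from this post: rarely-changing data is embedded into a
    vector store; anything that moves daily or faster is queried live."""
    return "RAG" if update_interval > timedelta(days=1) else "MCP"

for name, interval in SOURCES.items():
    print(f"{name}: answer via {access_path(interval)}")
```

The point is not the threshold. The point is that the decision is made per source, once, instead of being relitigated in every prompt.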
Most enterprise AI failures I see are not failures of intelligence. They are failures of plumbing. Get the plumbing right and the same model that embarrassed you last quarter starts looking like a senior analyst.
Sources: Chroma context rot research, Menlo Ventures State of Generative AI in the Enterprise 2025, Anthropic MCP documentation.