An LLM hallucinates when it produces confident text that sounds plausible but isn't true: a fake citation, a function name that doesn't exist, a quote that was never said, a fact that's almost-but-not-quite right. It's the most-discussed weakness of modern AI, and it isn't going away — it's a property of how the technology fundamentally works.
Why hallucination is structural
Recall the one-sentence definition of an LLM: it predicts the next token based on the context. The training process pushes the model toward predicting plausible text — text that sounds like it belongs in the kind of document being completed.
The model doesn't have a concept of "truth." It has a concept of "this kind of thing usually comes after this kind of context." If you ask for a citation for a real-but-obscure law, the model will produce something that looks like a real legal citation. Sometimes that citation actually exists. Sometimes the title is real but the year is wrong. Sometimes the whole thing is invented but formatted perfectly.
This isn't a bug in the model's truthiness module — there is no truthiness module. The model is doing exactly what it was trained to do: produce plausible text. "Plausible" and "true" overlap most of the time, but not always.
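To make "plausible overlaps with true, but not always" concrete, here's a toy sketch. The candidate citations and probabilities are invented for illustration; the point is that sampling ranks continuations by plausibility, so a wrong-but-well-formatted citation can come up almost as often as the real one.

```python
# Toy illustration of "plausible vs. true". Probabilities are invented;
# the model ranks continuations by plausibility, not truth.
import random

continuations = {
    "347 U.S. 483 (1954)": 0.45,  # the real Brown v. Board citation
    "347 U.S. 483 (1952)": 0.30,  # right case, wrong year
    "901 U.S. 483 (1954)": 0.25,  # invented volume, formatted perfectly
}

# Sampling picks plausibly, not truthfully: with these weights the
# wrong citations come up more than half the time.
print(random.choices(list(continuations), weights=continuations.values())[0])
```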
When hallucination happens most
Four patterns that increase hallucination risk:
Specific facts the model wasn't trained on. Recent events, niche topics, your private data. The model fills the gap with something plausible. Mitigation: RAG, web search tools.
Numerical precision. Statistics, citation page numbers, version numbers, timestamps. The model is bad at exact recall. Mitigation: verification step, source checking.
Adjacent-topic confusion. Asking about "the React 19 useTransition hook" when the model knows earlier React versions but not 19: it will blend what it does know into a plausible answer. Mitigation: explicit version anchoring, RAG against current docs.
Leading questions. "What did Yann LeCun say about Anthropic in his March 2024 interview?" If no such interview exists, the model will frequently invent one anyway. Mitigation: instructing the model to refuse when unsure (partially effective).
What doesn't fix hallucination
Three popular but misleading "fixes":
"Just use a bigger model." Bigger models hallucinate less on common topics but still hallucinate on edge cases. The trend is improvement, not elimination. Don't bet on the next model release fixing your reliability problem.
"Just fine-tune on the truth." Fine-tuning teaches patterns, not facts. You can fine-tune a model to mimic the style of citation but not to actually know which citations are real. For factual grounding, you need retrieval at runtime.
"Just add 'don't hallucinate' to the prompt." Helps a little. Models trained recently are noticeably better at saying "I don't know" when asked. But don't trust the prompt alone for high-stakes decisions.
What actually reduces hallucination
Four mitigations that pay off:
RAG (Retrieval-Augmented Generation). Look up relevant docs at runtime, paste them into the prompt, instruct the model to answer only based on retrieved content. This is the industry standard for AI features that need factual grounding. Hallucination drops dramatically when the model has the right text in front of it.
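A minimal sketch of the pattern. The keyword-overlap retriever and document list below are toy stand-ins for a real embedding index and corpus; only the prompt-assembly shape is the point.

```python
# Sketch of RAG prompt assembly: retrieve at runtime, then scope the
# answer to the retrieved text. The keyword-overlap retriever is a toy
# stand-in for embedding search over a real corpus.

DOCS = [
    "React 19 was released in December 2024.",
    "useTransition marks state updates as non-urgent so the UI stays responsive.",
    "RAG pastes retrieved documents into the prompt at query time.",
]

def retrieve(question: str, top_k: int = 2) -> list[str]:
    q_words = set(question.lower().split())
    return sorted(DOCS, key=lambda d: -len(q_words & set(d.lower().split())))[:top_k]

def build_rag_prompt(question: str) -> str:
    context = "\n".join(f"- {d}" for d in retrieve(question))
    return (
        "Answer using ONLY the context below. If the context does not "
        'contain the answer, reply "I don\'t know."\n\n'
        f"Context:\n{context}\n\nQuestion: {question}"
    )

print(build_rag_prompt("When was React 19 released?"))
```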
Citations and verification. Require the model to cite sources for any factual claim, then check the cited sources programmatically. Recent frontier models (Claude, GPT-5, Gemini) do this natively in their research modes.
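A crude but runnable sketch of the verification step, assuming you've instructed the model to return each claim as a (quote, url) pair. Exact substring matching is deliberately simplistic; a production checker would normalize whitespace and fuzzy-match.

```python
# Sketch of programmatic citation checking: fetch each cited source
# and confirm the quoted text actually appears in it.
import urllib.request

def verify_citation(quote: str, url: str) -> bool:
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            page = resp.read().decode("utf-8", errors="ignore")
    except OSError:
        return False  # unreachable source counts as unverified
    return quote.lower() in page.lower()

# (quote, url) pairs extracted from the model's answer.
claims = [("retrieval-augmented generation",
           "https://en.wikipedia.org/wiki/Retrieval-augmented_generation")]
for quote, url in claims:
    status = "verified" if verify_citation(quote, url) else "UNVERIFIED"
    print(f"{status}: {quote} <- {url}")
```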
Tool use for facts. Don't ask the model what time it is — give it a get_time tool. Don't ask for the current price of Bitcoin — give it a price-lookup tool. Whenever a fact has a definitive source, expose the source as a tool.
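A sketch of the get_time example. The tool definition follows the JSON-Schema shape most function-calling APIs share, but exact field names vary by provider, so treat the schema as illustrative.

```python
# Sketch of exposing a definitive source as a tool instead of asking
# the model to recall it. Schema field names vary by provider.
from datetime import datetime, timezone

GET_TIME_TOOL = {
    "name": "get_time",
    "description": "Returns the current UTC time. Call this instead of guessing.",
    "input_schema": {"type": "object", "properties": {}},
}

def get_time() -> str:
    # The system clock is the source of truth, not the model's memory.
    return datetime.now(timezone.utc).isoformat()

# When the model emits a tool call named "get_time", dispatch to the
# handler and feed the result back as the next message.
TOOL_HANDLERS = {"get_time": get_time}
print(TOOL_HANDLERS["get_time"]())
```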
Refusal training and uncertainty. Newer models are explicitly trained to recognize when they're uncertain and either say so or ask clarifying questions. Make use of this — system prompts can encourage "if you're not sure, say so."
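One way to phrase that encouragement. The wording below is illustrative, not a benchmark-tested recipe; test any variant against your own evals.

```python
# Illustrative system prompt encouraging calibrated refusal.
SYSTEM_PROMPT = (
    "You are a careful assistant. If you are not confident an answer is "
    "correct, say so explicitly or ask a clarifying question instead of "
    "guessing. Never invent citations, quotes, numbers, or URLs."
)
```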
How to live with residual hallucination
Even with all mitigations in place, hallucination is a tail risk, not a solved problem. Your design needs to assume some output will be wrong. Strategies:
- Layer human review at the points where wrong answers are most expensive (final draft of a contract, customer-facing reply, decision that costs money).
- Show your sources. If the user can verify, hallucinations get caught and don't propagate.
- Log and audit. Track outputs against feedback so you can identify patterns of hallucination and fix the cause (prompt, retrieval, model). A logging sketch follows this list.
- Don't promise 100% accuracy. Product copy that says "AI gets things wrong sometimes — verify important facts" is honest and trust-building. Copy that promises perfection turns one hallucination into a lawsuit.
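A minimal sketch of the logging side of that audit loop, writing JSON lines so failure patterns are easy to query later. Field names and feedback labels are illustrative.

```python
# Sketch of audit logging: record each output alongside later feedback.
import json
import time

def log_interaction(path: str, prompt: str, output: str,
                    feedback: str | None = None) -> None:
    record = {
        "ts": time.time(),
        "prompt": prompt,
        "output": output,
        "feedback": feedback,  # e.g. "hallucinated_citation", "correct"
    }
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")

log_interaction("audit.jsonl", "Cite the Brown decision.",
                "347 U.S. 483 (1952)", feedback="hallucinated_citation")
```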
Where hallucination is OK
A productive frame: hallucination matters when wrong = expensive. It matters less when:
- You're brainstorming ideas (false positives are fine, you'll filter)
- You're getting a starting draft you'll heavily edit
- You're using AI for inspiration, not facts
- The model's output is checked by someone who knows the truth
It matters a lot when:
- Output goes directly to customers, regulators, or the public
- Output drives automated decisions (financial, medical, legal)
- Verification is hard for the user (they don't know if it's right)
When NOT to obsess over hallucination
If your app is creative writing, brainstorming, or first-draft generation, perfect factual accuracy isn't the point. Spending engineering effort to push hallucination from 5% to 4% on "write me a poem about Tuesday" is wasted. Focus mitigations where wrongness has a cost.
Further reading
- What is RAG (Retrieval-Augmented Generation)
- What is a Large Language Model (LLM)
- What is tool use / function calling
- Defending against prompt injection: realistic guardrails for 2026
- How to evaluate LLM output quality at scale