What makes LLM outputs fabrication rather than hallucination or confabulation?
This explores why the corpus argues 'fabrication' is a more honest label for LLM errors than borrowed clinical terms like 'hallucination' or 'confabulation' — and what that relabeling changes about how we fix the problem.
Some researchers insist that when an LLM gets something wrong, we should call it fabrication rather than hallucination or confabulation, and the word we pick isn't just semantics. The core argument is mechanical: an LLM produces every output, true or false, through the exact same process of statistically predicting the next token, with no grounding in a shared reality (see: Should we call LLM errors hallucinations or fabrications?). 'Hallucination' implies a perception gone wrong; 'confabulation' implies a memory gone wrong. But the model has neither perception nor memory in any meaningful sense: its correct answers and its wrong answers are the same kind of thing, made the same way. The error isn't a malfunction in some faculty; fabrication is the default mode, and accuracy is the happy accident.
This matters because terminology quietly decides where you go looking for fixes. Call it hallucination and you chase better grounding, trying to feed the model a more accurate perception of the world. Call it fabrication and you accept that the output is unreliable by construction, so the fix shifts to external verification systems and calibrated uncertainty designed into the use case, not into the model's 'senses' (see: Does calling LLM errors hallucinations point us toward the wrong fixes?). This reframing gains force from a formal result: hallucination is mathematically inevitable for any computable LLM. Three theorems show that such a model must produce falsehoods on infinitely many inputs, and that no amount of internal self-correction can eliminate them (see: Can any computable LLM truly avoid hallucinating?). If the problem can't be fixed from the inside, then external safeguards aren't optional polish; they're the only real lever.
Here's where the corpus gets more interesting than a simple relabeling: LLM falsehood isn't even one thing. Shanahan's framework distinguishes fabrication, good-faith error, and role-played deception using nothing but behavioral signatures: regenerate the same prompt many times and watch the variation. Fabrication shows high variation across regenerations (the model is improvising freely), good-faith error shows low variation and stays stable, and role-played deception shows low variation but flips when the context changes (see: Can we distinguish types of LLM falsehood by regeneration patterns?). This lets you diagnose what kind of falsehood you're dealing with without making claims about what the model 'believes.' And there's a category even fact-checkers miss: when prompted to fuse semantically distant concepts, models build elaborate, plausible-sounding frameworks for connections that don't legitimately exist, and present them as defensible research rather than flagging them as speculation (see: Do language models evaluate semantic legitimacy when fusing concepts?).
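The regeneration test described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the source does not say how variation is measured, so the word-overlap (Jaccard) distance, the 0.5 threshold, and the names `token_distance` and `classify_falsehood` are all hypothetical choices, and the regenerated answers are assumed to have been collected already.

```python
from itertools import combinations


def token_distance(a: str, b: str) -> float:
    """Jaccard distance between the lowercase word sets of two answers."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 0.0
    return 1.0 - len(ta & tb) / len(ta | tb)


def mean_distance(pairs) -> float:
    """Average token distance over an iterable of (answer, answer) pairs."""
    pairs = list(pairs)
    if not pairs:
        return 0.0
    return sum(token_distance(a, b) for a, b in pairs) / len(pairs)


def classify_falsehood(same_context, shifted_context, threshold=0.5):
    """Label a falsehood from its regeneration pattern.

    same_context:    answers regenerated from the identical prompt
    shifted_context: answers regenerated after the framing was changed
    threshold:       assumed cut-off between "low" and "high" variation
    """
    if not same_context or not shifted_context:
        raise ValueError("both lists of regenerations must be non-empty")
    # High variation under an identical prompt: the model is improvising.
    within = mean_distance(combinations(same_context, 2))
    if within >= threshold:
        return "fabrication"
    # Low variation: check whether the answer flips when the context changes.
    across = mean_distance((a, b) for a in same_context for b in shifted_context)
    return "role-played deception" if across >= threshold else "good-faith error"
```

A real test would likely need a semantic comparison rather than word overlap, since two differently worded answers can assert the same thing. The point of the sketch is only that the diagnosis uses observable outputs and makes no claim about what the model 'believes.'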
Underneath all of this is a deeper claim about what LLM text even is. The 'Foundation Priors' view says model outputs are draws from a subjective prior distribution (reflections of learned patterns and your prompt), not empirical observations of the world, and treating them as evidence is a category error (see: Should we treat LLM outputs as real empirical data?). This connects to the argument that LLMs and humans aren't doing the same thing at all: humans use language to address and relate to one another, while models emit strings from a probability distribution that merely shares the surface form (see: Are language models and human speakers doing the same thing?). 'Fabrication' is the term that respects that gap. (There's a counterweight worth knowing about: some argue LLMs do achieve a kind of indirect grounding by extracting causal structure from text written by grounded humans, so the chain to reality isn't fully severed, just gappy; see: Can large language models develop genuine world models without direct environmental contact?)
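One way to picture "draws from a prior, not observations" is a Bayesian update in which LLM-generated pseudo-observations count only in proportion to an explicit trust weight. The sketch below is an assumption-laden illustration, not the Foundation Priors method itself: the Beta-Binomial model, the function names, and the `trust` parameter in [0, 1] are hypothetical.

```python
def beta_posterior(prior_a, prior_b, real_successes, real_failures,
                   llm_successes=0, llm_failures=0, trust=0.0):
    """Return Beta posterior parameters (a, b) for a success rate.

    Real observations count fully. LLM-generated pseudo-observations are
    scaled by `trust` (assumed convention: 0 ignores them, 1 treats them
    as if they were real data).
    """
    if not 0.0 <= trust <= 1.0:
        raise ValueError("trust must lie in [0, 1]")
    a = prior_a + real_successes + trust * llm_successes
    b = prior_b + real_failures + trust * llm_failures
    return a, b


def posterior_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


# Hypothetical numbers: 10 real trials versus 100 LLM-generated "trials".
ignored = beta_posterior(1, 1, 6, 4, llm_successes=90, llm_failures=10, trust=0.0)
naive = beta_posterior(1, 1, 6, 4, llm_successes=90, llm_failures=10, trust=1.0)
```

With `trust=1.0` the hundred generated cases swamp the ten real ones, which is the category error the paragraph describes; with `trust=0.0` they have no influence. Anything in between has to be chosen and justified explicitly, which is the point of parameterizing it.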
The quietly unsettling payoff: even your instinct to make outputs 'reliable' by setting temperature to zero is misguided. Determinism just makes the model repeat one draw from its distribution: consistent, but still one sample, and consistency is not the same as truth (see: Does setting temperature to zero actually make LLM outputs reliable?). The whole cluster of arguments points one way: the falsehoods aren't a bug to be perceived away; they're a property of a fabrication engine, and that recognition changes what you build around it.
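The temperature-zero point can be made concrete with a toy next-answer distribution. This is a minimal sketch rather than how any particular model or API decodes: the candidate answers, their logits, and the `generate` helper are invented for illustration, and temperature zero is assumed to mean greedy (argmax) selection.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities at a positive temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def generate(answers, logits, temperature, rng):
    """Pick one answer: greedy at temperature 0, sampled otherwise."""
    if temperature == 0:
        best = max(range(len(logits)), key=logits.__getitem__)
        return answers[best]
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(answers, weights=probs, k=1)[0]


# Hypothetical candidates and scores; nothing here says which one is true.
answers = ["1912", "1913", "1915"]
logits = [2.0, 1.8, 0.5]
rng = random.Random(0)

greedy = [generate(answers, logits, 0, rng) for _ in range(100)]
sampled = [generate(answers, logits, 1.0, rng) for _ in range(100)]
probs = softmax_with_temperature(logits, 1.0)

# All 100 greedy runs agree, yet that answer holds under half of the
# probability mass in this toy distribution.
print(len(set(greedy)), round(probs[0], 2))
```

The hundred greedy runs are perfectly consistent, but they only reveal the mode of the distribution. Whether that mode is true is a separate question that repetition cannot answer.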
Sources (9 notes)
LLMs generate text through statistical token relationships without grounding in shared context. Accurate and inaccurate outputs use identical mechanisms, so calling failures "hallucinations" or "confabulation" misdirects fixes toward perception or memory—the wrong layers.
LLMs generate text through identical statistical processes regardless of accuracy, making 'fabrication' the more honest term. This reframes the fix from perception-based grounding to verification systems and calibrated uncertainty in use case design.
Three formal theorems prove that any computable LLM must hallucinate on infinitely many inputs, and internal mechanisms like self-correction cannot eliminate this mathematical constraint. External safeguards are therefore necessary, not optional.
Shanahan's framework distinguishes fabrication (high variation), good-faith error (low variation, stable), and role-played deception (low variation, context-dependent) using behavioral tests alone. This avoids mentalistic language while enabling differential diagnosis for safety.
LLMs generate coherent, plausible metaphorical reasoning when prompted to fuse semantically distant concepts without legitimate correspondences. Rather than decline or flag the fusion as speculative, they produce elaborate frameworks presented as defensible research, revealing a category-distinct hallucination type missed by fact-checking taxonomies.
The Foundation Priors framework shows that LLM-generated text reflects the model's learned patterns and the user's prompt choices, not ground truth. Such outputs should only influence inference through explicitly parameterized trust weights, not be treated as equivalent to real evidence.
LLMs produce strings via probability distributions; humans use language to address and relate to others. They share surface form but differ in what produces output, what it does socially, and what receivers should do with it.
LLMs form structured world representations by extracting regularities from training data produced by causally grounded humans. This constitutes indirect causal grounding mediated through text, though the chain has gaps that limit real-time verification and model updating.
Fixed seeds and zero temperature replicate the same output repeatedly, but that output remains one draw from the model's probability distribution. McDonald's omega testing across 100 repetitions reveals that consistency does not equal reliability.