A comprehensive taxonomy of hallucinations in Large Language Models

Paper · arXiv 2508.01781 · Published August 3, 2025
Tags: Flaws · Alignment · Prompts & Prompting · Linguistics, NLP, NLU

This report provides a comprehensive taxonomy of LLM hallucinations, beginning with a formal definition and a theoretical framework that posits the inherent inevitability of hallucination in computable LLMs, irrespective of architecture or training. It explores core distinctions, differentiating intrinsic hallucinations (which contradict the input context) from extrinsic ones (which are inconsistent with the training data or reality), and factuality (absolute correctness against real-world knowledge) from faithfulness (adherence to the input). The report then details specific manifestations, including factual errors, contextual and logical inconsistencies, temporal disorientation, ethical violations, and task-specific hallucinations in domains such as code generation and multimodal applications.

• Intrinsic hallucination: Intrinsic hallucinations refer to generated text that directly contradicts the provided input or context[7;70]. These errors arise from logical inconsistencies within the generated output itself, without necessarily requiring reference to external knowledge[7]. This type of hallucination reflects the model’s inability to maintain consistency during the inference process, or limitations stemming from its internal knowledge and parametric memory[7]. It can also encompass instances where the model misinterprets or omits crucial details from a given document, leading to an inaccurate representation of the source information[7].

• Extrinsic hallucination: Extrinsic hallucinations, conversely, refer to generated text that is not consistent with the training data and “can neither be supported nor refuted by the input context”[79;7]. This category involves the introduction of entities, facts, or events that do not exist in reality. Such hallucinations frequently occur when models generate novel content or attempt to bridge perceived knowledge gaps[79;7]. This phenomenon highlights the model’s limitations in fully absorbing knowledge from its training data and its inability to accurately recognize the boundaries of its own knowledge. It can also result from issues with integrating external information, or from the model misinterpreting or failing to correctly incorporate the given context or prompt[79;7].

• Factuality hallucination: Factuality hallucination occurs when an LLM generates “factually incorrect content”[42;50]. This type of hallucination directly contradicts “real-world knowledge” or “established verification sources”. It pertains to the “absolute correctness of the content generated” when compared against verifiable information. These errors often arise from the model’s limited contextual understanding and the inherent noise or inaccuracies in its training data, leading to responses that are not grounded in reality[13;50].

• Faithfulness hallucination: Faithfulness errors occur when the model’s output “diverges from the input prompt or provided context”[64;96;61]. The generated response may be internally consistent and appear plausible, yet fail to align with the user’s expectations or the specific information explicitly provided in the input. This type of hallucination is closely related to, and often overlaps with, intrinsic hallucination, as both concern inconsistencies relative to the given source[64;96;61].
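These four categories can be made operational in a toy way. The sketch below is hypothetical code, not from the paper: it assumes idealized oracle flags for context support and world-knowledge support, which real pipelines can only approximate with entailment models or retrieval. Under that assumption, the taxonomy reduces to two checks:

```python
from dataclasses import dataclass

@dataclass
class Claim:
    text: str
    supported_by_context: bool | None   # True/False, or None if the context is silent
    supported_by_world: bool | None     # True/False, or None if unverifiable

def classify_hallucination(claim: Claim) -> list[str]:
    """Toy classifier for the intrinsic/extrinsic and factuality/faithfulness
    distinctions. The two `supported_by_*` flags stand in for oracle judgments."""
    labels = []
    if claim.supported_by_context is False:
        # Contradicts the provided input: intrinsic, and a faithfulness failure.
        labels += ["intrinsic", "faithfulness"]
    elif claim.supported_by_context is None:
        # Can neither be supported nor refuted by the context: extrinsic.
        labels.append("extrinsic")
    if claim.supported_by_world is False:
        # Contradicts real-world knowledge: a factuality failure.
        labels.append("factuality")
    return labels or ["consistent"]

# The Nile example from Section 4.2: the added detail is absent from the context
# and wrong about the world.
print(classify_hallucination(Claim(
    "The Nile originates in the mountain ranges of Central Africa",
    supported_by_context=None, supported_by_world=False)))
# -> ['extrinsic', 'factuality']
```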

Theorem 1 (computably enumerable LLMs will hallucinate). For any computably enumerable set of LLMs, there exists a computable ground truth function f such that all states of all LLMs in that set will hallucinate. In particular, all currently proposed polynomial-time bounded LLMs are inherently prone to hallucination; it cannot be completely eliminated.
Theorem 2 (LLMs will hallucinate on infinitely many questions). For any computably enumerable set of LLMs, there exists a computable ground truth function f such that all states of all LLMs in that set will hallucinate on infinitely many inputs. Hallucinations are thus not isolated incidents but a persistent challenge across a vast range of inputs for any LLM.
Theorem 3 (any computable LLM will hallucinate). For any individual computable LLM, there exists a computable ground truth function f such that every state of that LLM will hallucinate. Furthermore, for any computable LLM there exists another function f′ such that every state will hallucinate on infinitely many inputs. This generalizes inevitability to any specific LLM, implying that current and future LLMs will always exhibit some form of hallucination.
Corollary 1 (inability to self-eliminate hallucination). No computable LLM can prevent itself from hallucinating. LLMs therefore cannot rely solely on internal mechanisms (e.g., self-correction, chain-of-thought prompting) to eliminate hallucination; external safeguards are essential.
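The underlying formalization can be reconstructed from the theorem statements. The LaTeX sketch below states the definition of hallucination as disagreement with a ground truth function and illustrates the diagonalization idea behind Theorem 1; it is an illustrative reconstruction consistent with the statements above, not the paper's verbatim proof:

```latex
\paragraph{Setup.} Let $h : \Sigma^* \to \Sigma^*$ be the computable
input--output function realized by an LLM state, and let
$f : \Sigma^* \to \Sigma^*$ be a computable ground truth function.
The state $h$ \emph{hallucinates} with respect to $f$ if
\[
  \exists\, s \in \Sigma^* : \quad h(s) \neq f(s).
\]

\paragraph{Theorem 1 (proof idea).} Let $h_1, h_2, \dots$ be a computable
enumeration of all states of all LLMs in the set, and let $s_1, s_2, \dots$
enumerate $\Sigma^*$. Define $f$ by diagonalization: choose $f(s_i)$ to be
any computable value with $f(s_i) \neq h_i(s_i)$. Then $f$ is computable,
and every state $h_i$ hallucinates with respect to $f$.
```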

4 Specific categories and manifestations of hallucinations

This section details various specific types of hallucinations, including factual errors, contextual inconsistencies, and task-specific manifestations. Beyond the core intrinsic/extrinsic and factuality/faithfulness distinctions, LLM hallucinations manifest in numerous specific forms, often with distinct characteristics and implications (see summary in Table 2).

4.1 Factual errors and fabrications

This is a prevalent and particularly dangerous type of LLM hallucination, characterized by the generation of incorrect, misleading, or entirely fabricated factual content, frequently presented with a high degree of confidence. Such errors can appear as inaccuracies in historical information, scientific facts, or biographical details[14].

4.1.1 Incorrect facts

These are claims that directly oppose established and verified information[14;6]. An example is Google Bard’s hallucination claiming the James Webb Space Telescope took the first images of an exoplanet, despite NASA’s records indicating that earlier images existed. Other instances include the assertion that “The Great Wall of China is visible from space” or the statement that “Thomas Edison invented the internet”.

4.1.2 Fabricated entities/information

This involves the invention of historical figures, events, or specific details that do not exist in reality.

This can extend to creating entirely fictitious narratives, such as a claim about “unicorns in Atlantis” being documented in 10,000 BC. In legal contexts, this type of hallucination can be particularly severe, involving fabricated information such as fake quotes and citations of non-existent court cases, leading to significant professional and legal consequences[72]. Similarly, in medical contexts, models may fabricate clinical details, invent research citations, or concoct nonexistent diseases, posing substantial risks to patient care[51;12].

4.1.3 Adversarial attacks

A specific subset of factual errors arises from adversarial attacks, where deliberately or inadvertently fabricated details embedded in user prompts lead the model to produce or elaborate on false information. This phenomenon can result in a “garbage in, garbage out” problem, where erroneous inputs propagate misleading outputs, and it also presents a threat of malicious misuse, where bad actors could exploit LLMs to spread falsehoods[51;99;108].

4.2 Contextual inconsistencies

Contextual inconsistencies occur when the model’s output includes information not present in the provided context or directly contradicting it. This type of hallucination is often referred to as “context divergence” or “contextual misalignment”, indicating the model’s difficulty in correctly attending to the relevant context, relying instead on its internal generative tendencies. An example is when the model is given the context “The Nile originates in Central Africa,” but responds with “The Nile originates in the mountain ranges of Central Africa,” adding incorrect details not found in the original input[42;4;27].
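A crude way to make this failure mode concrete is a lexical check for content words in the output that have no support in the context. The sketch below is a naive heuristic invented for illustration, not a method from the paper; it misses paraphrases and flags benign synonyms, which is why real faithfulness checkers use entailment or QA-based models instead:

```python
import re

STOPWORDS = {"the", "a", "an", "of", "in", "on", "is", "are", "it", "to", "and"}

def content_words(text: str) -> set[str]:
    """Lowercased alphabetic tokens minus a tiny stopword list."""
    return set(re.findall(r"[a-z]+", text.lower())) - STOPWORDS

def unsupported_additions(context: str, output: str) -> set[str]:
    """Content words in the output with no lexical support in the context."""
    return content_words(output) - content_words(context)

# The Nile example above: the added detail is flagged.
print(unsupported_additions(
    "The Nile originates in Central Africa.",
    "The Nile originates in the mountain ranges of Central Africa."))
# -> {'mountain', 'ranges'}
```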

4.3 Instruction inconsistencies/deviation

Instruction inconsistencies refer to instances where the LLM ignores or fails to follow specific instructions provided by the user. The generated response, in these cases, does not adhere to the user’s explicit directives. For example, if instructed to translate a question into Spanish, the model might instead provide the answer in English[101].

4.4 Logical inconsistencies

Logical inconsistencies manifest when the model’s output contains internal logical errors or contradictions, even if the initial part of the response is correct. This can appear as self-contradictory statements within the same output or across different interaction instances. This type of hallucination is related to “erroneous inference hallucination” and accounts for a notable portion, specifically 19%, of identified hallucination cases[42;47;34;95]. An example is an LLM performing an arithmetic operation incorrectly within a step-by-step mathematical solution, or stating a fact in one sentence and then contradicting it later in the same response.
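The arithmetic case is mechanically checkable. The sketch below is an illustrative helper, not a method from the paper: it re-verifies explicit integer equations of the form a <op> b = c inside a step-by-step solution, while contradictions in prose would still require an entailment model:

```python
import re

STEP = re.compile(r"(-?\d+)\s*([+\-*])\s*(-?\d+)\s*=\s*(-?\d+)")
OPS = {"+": lambda a, b: a + b,
       "-": lambda a, b: a - b,
       "*": lambda a, b: a * b}

def check_arithmetic(solution: str) -> list[str]:
    """Return the arithmetic steps whose stated result is wrong."""
    errors = []
    for a, op, b, claimed in STEP.findall(solution):
        if OPS[op](int(a), int(b)) != int(claimed):
            errors.append(f"{a} {op} {b} = {claimed}")
    return errors

# A hallucinated chain-of-thought step: the model asserts 17 * 4 = 78.
print(check_arithmetic("First, 12 + 5 = 17. Then 17 * 4 = 78."))
# -> ['17 * 4 = 78']
```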

4.5 Temporal disorientation

Temporal disorientation describes a type of hallucination involving time-sensitive information, leading to the generation of outdated, anachronistic, or temporally incorrect facts. LLMs are particularly noted for struggling with “intricate temporal features” and out-of-distribution knowledge related to time. This category accounts for 12% of identified hallucination cases[47;51]. An illustrative example is an LLM incorrectly asserting that “Haruki Murakami won the Nobel Prize in Literature in 2016,” when in fact he has not won the Nobel Prize.

4.6 Ethical violations

Ethical violations refer to hallucinations that result in harmful, defamatory, or legally incorrect content. These instances can have severe real-world consequences, damaging individuals’ reputations, causing financial losses, or leading to legal repercussions. Ethical violations represent 6% of hallucination cases in some analyses[47;40;31].

4.6.1 Defamation/misinformation

Examples include ChatGPT falsely claiming that a university professor had made sexually suggestive comments and attempted to touch a student, citing a non-existent news article[42;18]. In another case, ChatGPT falsely claimed that a mayor had been imprisoned for bribery, when he was in fact the whistleblower who reported the conduct.

4.6.2 Financial misinformation

An AI chatbot providing incorrect refund information to a customer, resulting in financial loss for both the customer and the airline, exemplifies how hallucinations can lead to tangible economic harm[102].

4.6.3 Legal inaccuracies

LLMs can produce content that deviates from actual legal facts, well-established legal principles, or precedents. This includes generating “bogus judicial decisions, bogus quotes, and bogus internal citations”. Such errors can lead to “representational harm”, where the contributions of one member of the legal community are systematically erased or misattributed[51;19].

4.7 Amalgamated hallucinations

Amalgamated hallucinations occur when the model incorrectly combines multiple facts or conditions presented within a single prompt. This happens when the LLM fails to properly integrate several distinct conditions, resulting in a blended output that erroneously merges disparate pieces of information[27;105].

4.8 Nonsensical responses

Nonsensical responses are instances where LLMs generate output that is irrelevant to the input prompt or internally incoherent. This type highlights the model’s limitations in understanding context or maintaining a logical thread in a conversation, posing significant challenges in user-interaction scenarios where clarity and relevance are paramount[42]. An example is a conversation about the NBA Commissioner in which the LLM initially refers to “Adam Silver” but then abruptly switches to “Stern” within the same response.

4.9 Task-specific hallucinations

Hallucinations can manifest uniquely depending on the specific generative task the LLM is performing.

4.9.1 Dialogue history-based hallucination

This occurs when an LLM mixes up names or relations of entities from the conversation history, or creates new incorrect inferences based on previous errors, leading to a “snowball effect” of distorted context. This arises because LLMs rely on pattern recognition and statistics, often lacking common sense or factual grounding in dialogue[100;26].

4.9.2 Abstractive summarization hallucination

Systems designed for abstractive summarization can introduce errors or semantic transformations between the source document and the generated summary, distorting or fabricating details, inferring unsupported causal relationships, or retrieving unrelated background knowledge. This is attributed to their reliance on pattern recognition rather than true comprehension of the source text[100;44;64].

4.9.3 Generative question answering hallucination

In this context, the LLM makes an erroneous inference from its source information, leading to an incorrect answer, even when relevant source material is provided. The model may ignore evidence and make unjustified inferences based on its own prior knowledge[100;92].

4.9.4 Code generation hallucination

When generating source code, LLMs can produce incorrect, nonsensical, or unjustifiable code that is difficult to identify and fix, especially under specific execution paths. This undermines the trustworthiness of generated code and can introduce significant risks and errors into codebases. Existing surveys classify these into input-conflicting, context-conflicting, and fact-conflicting types[57;2]; a sketch of one safeguard follows.
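One cheap, automatable guard against the fact-conflicting type is to introspect the real module for every API member a generated snippet references before executing it. The helper below is hypothetical and invented for illustration, not taken from the cited surveys; input- and context-conflicting cases instead require checking the code against the prompt and the session history:

```python
import importlib

def fabricated_members(module_name: str, members: list[str]) -> list[str]:
    """Return members that a generated snippet references but the real
    module does not provide: a cheap pre-execution check for
    fabricated-API hallucinations (hypothetical helper)."""
    module = importlib.import_module(module_name)
    return [m for m in members if not hasattr(module, m)]

# A model that hallucinates `json.parse` (the real function is `json.loads`)
# is caught before the generated code ever runs:
print(fabricated_members("json", ["loads", "parse"]))
# -> ['parse']
```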

4.9.5 Multimodal large language models hallucination

In multimodal large language models (MLLMs), hallucinations primarily concern the “discrepancy between generated text response and provided visual content,” a phenomenon known as cross-modal inconsistency[38;98]. Object hallucination in MLLMs is empirically categorized into three types (see the sketch after this list):

• Category: identifies nonexistent or incorrect object categories in a given image[38;98].

• Attribute: emphasizes incorrect descriptions of objects’ attributes (e.g., color, shape, material)[38;98].

• Relation: assesses incorrect relationships between objects[38;98].
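These three types can be made concrete with a toy check of a generated description against ground-truth image annotations. All data structures below are invented for illustration; benchmarks such as POPE operationalize the category check by polling yes/no object questions against such annotations:

```python
# Toy annotations for one image: objects, their attributes, and relations.
image_objects = {"dog", "ball"}
image_attributes = {("dog", "brown"), ("ball", "red")}
image_relations = {("dog", "chases", "ball")}

def check_caption(objs: set, attrs: set, rels: set) -> list[str]:
    """Flag the three object-hallucination types against the annotations."""
    errors = []
    for o in objs - image_objects:
        errors.append(f"category: nonexistent object {o!r}")
    for pair in attrs - image_attributes:
        errors.append(f"attribute: wrong description {pair}")
    for triple in rels - image_relations:
        errors.append(f"relation: wrong relationship {triple}")
    return errors

# A hallucinated caption: "A black dog sits on a chair."
print(check_caption(objs={"dog", "chair"},
                    attrs={("dog", "black")},
                    rels={("dog", "sits on", "chair")}))
# -> flags one error of each type
```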