AI Governance Library

A Comprehensive Taxonomy of Hallucinations in Large Language Models (Universitat de Barcelona, August 2025)

A formal, systematic deep dive into the inevitability of LLM hallucinations, presenting a layered taxonomy, causes, metrics, and mitigation approaches.

⚡ Quick Summary

This report by Manuel Cossio offers a highly structured, formal, and comprehensive framework for understanding hallucinations in Large Language Models (LLMs). It argues that hallucinations are not merely a bug but a theoretically inevitable feature of computable LLMs. The taxonomy categorizes hallucinations along core dimensions (intrinsic vs extrinsic; factuality vs faithfulness) and maps their manifestations across specific tasks (e.g., summarization, code generation, multimodal outputs). The document also breaks down hallucination causes (data, model, prompt), explores the human cognitive biases that shape how hallucinations are perceived, surveys benchmarks (e.g., TruthfulQA, HalluLens), and proposes mitigation strategies, both architectural (e.g., RAG, Toolformer) and systemic (e.g., human-in-the-loop evaluation, uncertainty displays). It’s a cornerstone resource for researchers and practitioners aiming to design safer and more trustworthy LLM systems.

🧩 What’s Covered

1. Formal Framework and Inevitability

The report builds on computability theory to prove that hallucination is unavoidable in LLMs. It introduces a formal definition where any computable LLM h diverges from a ground truth function f for some input, no matter the training corpus or architecture. The core theorems (T1–T3) argue hallucinations are universal, frequent, and inescapable across all LLM states.
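
For readers who want the claim in symbols, here is a minimal restatement of that definition; the notation (h for the computable model, f for the ground-truth function, Σ* for the space of input strings) is assumed for illustration rather than quoted from the paper:

```latex
% Sketch of the inevitability claim summarized above (assumed notation):
% for every computable LLM h there exists at least one input x on which
% h's output diverges from the ground-truth function f.
\forall h \in \mathcal{H}_{\mathrm{computable}}\;\; \exists x \in \Sigma^{*} : \; h(x) \neq f(x)
```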

2. Core Taxonomy: Intrinsic vs Extrinsic; Factuality vs Faithfulness

Using a layered model (see Page 1 diagram), the report categorizes hallucinations along two axes; a minimal tagging sketch follows the list:

  • Intrinsic: internally inconsistent with input (e.g., temporal or logical contradiction).
  • Extrinsic: fabricated content not in input or reality.
  • Factuality: contradicts real-world facts.
  • Faithfulness: deviates from input context or instruction.
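
To make the two axes concrete, here is a minimal tagging sketch in Python; the enum names, dataclass, and example record are illustrative assumptions, not a schema taken from the report.

```python
from dataclasses import dataclass
from enum import Enum


class Source(Enum):
    """Where the hallucinated content comes from, relative to the input."""
    INTRINSIC = "internally inconsistent with the provided input"
    EXTRINSIC = "fabricated content not grounded in the input or reality"


class Violation(Enum):
    """What the hallucination breaks: world facts or the given context."""
    FACTUALITY = "contradicts real-world facts"
    FAITHFULNESS = "deviates from the input context or instruction"


@dataclass
class HallucinationTag:
    """One labelled hallucination, placed along both axes of the taxonomy."""
    source: Source
    violation: Violation
    example: str


# Illustrative tag: a fabricated entity is extrinsic and violates factuality.
tag = HallucinationTag(
    source=Source.EXTRINSIC,
    violation=Violation.FACTUALITY,
    example="The Parisian Tiger was hunted to extinction.",
)
print(tag.source.name, tag.violation.name)
```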

3. Specific Manifestations

The report lists and defines 14+ hallucination types:

  • Factual errors (e.g., incorrect biographical claims).
  • Instruction deviation (ignoring directives).
  • Temporal disorientation (outdated claims).
  • Amalgamated errors.
  • Nonsensical outputs.
  • Multimodal inconsistencies.

Each type is illustrated with examples (e.g., “The Parisian Tiger was hunted to extinction” as extrinsic fabrication).

4. Root Causes

Hallucinations are attributed to:

  • Data issues: outdated, biased, or noisy sources.
  • Model design: autoregressive generation, overconfidence, lack of reasoning.
  • Prompting: adversarial input, ambiguity, confirmatory bias.

5. Human Cognitive Factors

The report identifies how biases (e.g., automation bias, fluency heuristics) and user overtrust amplify hallucination risks. These are compounded by LLM overconfidence and stylistic polish, which obscure factual inaccuracy.

6. Mitigation Strategies

  • Architectural: Toolformer, Retrieval-Augmented Generation (RAG), adversarial fine-tuning (see the RAG sketch after this list).
  • Systemic: UI-level mitigations (uncertainty displays, source-grounding), symbolic guardrails, fallback policies.
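
As a rough illustration of the retrieval-grounded pattern (not the paper’s own implementation), the sketch below retrieves supporting passages before generation; the toy corpus, the bag-of-words `embed`, and the placeholder `generate` are all assumptions standing in for a real vector store, encoder, and LLM call.

```python
from typing import Callable, List

# Toy corpus standing in for a real document store (illustrative only).
CORPUS = [
    "The Eiffel Tower is located in Paris and was completed in 1889.",
    "Retrieval-Augmented Generation grounds answers in retrieved text.",
    "Toolformer teaches a model to call external tools such as calculators.",
]


def embed(text: str) -> set:
    """Hypothetical embedding: a bag-of-words set instead of a real encoder."""
    return set(text.lower().split())


def retrieve(query: str, corpus: List[str], k: int = 2) -> List[str]:
    """Rank passages by word overlap with the query and keep the top k."""
    ranked = sorted(corpus, key=lambda doc: len(embed(doc) & embed(query)), reverse=True)
    return ranked[:k]


def generate(prompt: str) -> str:
    """Placeholder LLM call; a real system would send the prompt to a model."""
    return f"[model answer grounded in a prompt of {len(prompt)} characters]"


def rag_answer(question: str, llm: Callable[[str], str] = generate) -> str:
    """Prepend retrieved evidence so the model answers from sources, not memory alone."""
    evidence = "\n".join(retrieve(question, CORPUS))
    prompt = (
        "Answer using only the evidence below.\n\n"
        f"Evidence:\n{evidence}\n\n"
        f"Question: {question}"
    )
    return llm(prompt)


print(rag_answer("When was the Eiffel Tower completed?"))
```

The point of the pattern is that the model is asked to condition on retrieved evidence, shifting failures from unsupported fabrication toward retrieval gaps that are easier to detect and audit.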

7. Benchmarks and Metrics

The report surveys key datasets:

  • TruthfulQA, FActScore, HalluLens (taxonomy-aware).
  • Domain-specific (e.g., MedHallu, CodeHaluEval).

And metrics:

  • SummaC, FactCC, QuestEval, RAE, KILT.

Notably, human evaluation remains the gold standard.
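
Most of these metrics share one underlying recipe: decompose an output into checkable units and measure how many of them a reference source supports. The sketch below mirrors that FActScore-style idea under heavy simplification; the sentence-level claim splitting and the substring "verifier" are assumptions for illustration, not any benchmark’s actual method.

```python
from typing import List


def split_into_claims(output: str) -> List[str]:
    """Crude claim extraction: treat each sentence as one atomic claim (an assumption)."""
    return [s.strip() for s in output.split(".") if s.strip()]


def is_supported(claim: str, knowledge: List[str]) -> bool:
    """Toy verifier: a claim counts as supported if a source sentence contains it verbatim."""
    return any(claim.lower() in source.lower() for source in knowledge)


def factuality_score(output: str, knowledge: List[str]) -> float:
    """Fraction of extracted claims that the knowledge source supports."""
    claims = split_into_claims(output)
    if not claims:
        return 1.0
    supported = sum(is_supported(claim, knowledge) for claim in claims)
    return supported / len(claims)


knowledge = ["Marie Curie won Nobel Prizes in Physics and Chemistry."]
output = "Marie Curie won Nobel Prizes in Physics and Chemistry. She was born in Vienna."
print(factuality_score(output, knowledge))  # 0.5: one supported claim, one unsupported
```

Real metrics replace both placeholders with learned components (a claim extractor plus an NLI- or QA-based verifier), which is part of why human evaluation still anchors the reported numbers.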

8. Monitoring Tools

The report reviews real-time hallucination tracking via:

  • Vectara Hallucination Leaderboard,
  • Epoch AI Dashboard (links hallucination reduction with training compute),
  • LM Arena (user-driven head-to-head evaluations, real-world trust signals).

💡 Why it matters?

This report reframes hallucination not as an error to fix but as a design constraint inherent to current LLM paradigms. It equips AI developers, safety researchers, and regulators with the conceptual rigor and empirical tools needed to move beyond cosmetic solutions. By anchoring hallucination within computability theory, it sets clear boundaries for mitigation, grounding the conversation in what’s feasible, not just desirable. Its layered taxonomy and mapping of hallucination types to causes and metrics enable more effective task-specific and domain-specific responses. In a landscape where LLMs are deployed in high-stakes domains (law, medicine, finance), this report helps operationalize trustworthy AI through layered safeguards and continuous monitoring.

❓ What’s Missing

  • Unified standard: While it discusses multiple benchmarks and metrics, the field still lacks a single evaluation framework harmonized across tasks and hallucination types.
  • Mitigation efficacy: The report outlines mitigation strategies but stops short of deeply analyzing their relative performance or long-term viability.
  • Case studies: Real-world deployment examples or user impact studies would bolster its applied relevance.
  • Causal reasoning mechanisms: It notes the lack of LLM causal reasoning but doesn’t explore architectural proposals to fill that gap.

👥 Best For

  • AI safety researchers designing hallucination detection and mitigation protocols.
  • LLM developers and architects exploring ways to reduce risk and increase reliability.
  • Product teams deploying LLMs in regulated or high-stakes environments.
  • Policymakers drafting LLM oversight frameworks.
  • Academics and PhD students in NLP, HCI, and cognitive science studying trust and interpretability.

📄 Source Details

  • Title: A Comprehensive Taxonomy of Hallucinations in Large Language Models
  • Author: Manuel Cossio, MMed, MEng
  • Institution: Universitat de Barcelona
  • Date: August 2025
  • arXiv ID: 2508.01781v1 [cs.CL]
  • Length: 56 pages
  • Primary References: Xu et al. (2024) on inevitability, HalluLens (2025), FActScore (2023), RAG architectures, Vectara leaderboard, Toolformer (2023), and Med-PaLM 2 design patterns.

📝 Thanks to

  • Ziwei Xu et al. for the formal proof framework
  • Yejin Bang and team for HalluLens benchmark
  • Epoch AI, Vectara, LM Arena for real-time hallucination tracking platforms
  • Timo Schick et al. for Toolformer
  • Stephanie Lin et al. for TruthfulQA and uncertainty displays research
About the author
Jakub Szarmach

AI Governance Library

Curated Library of AI Governance Resources
