Hallucination is Inevitable: An Innate Limitation of Large Language Models

Paper · arXiv 2401.11817 · Published January 22, 2024

In this paper, we formalize the problem and show that it is impossible to eliminate hallucination in LLMs. Specifically, we define a formal world in which hallucination is characterized as any inconsistency between a computable LLM and a computable ground truth function. By employing results from learning theory, we show that LLMs cannot learn all of the computable functions and will therefore always hallucinate. Since the formal world is a part of the real world, which is far more complicated, hallucination is also inevitable for real-world LLMs. Furthermore, for real-world LLMs constrained by provable time complexity, we describe the hallucination-prone tasks and empirically validate our claims. Finally, using the formal world framework, we discuss the possible mechanisms and efficacies of existing hallucination mitigators as well as the practical implications for the safe deployment of LLMs.
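
Read formally, the abstract's central claim can be sketched as below; the notation (a ground truth f, enumerated LLMs h_i, a set of input strings S) is assumed shorthand for illustration rather than the paper's own symbols.

```latex
% A minimal sketch of the setup, using assumed notation (not the paper's own).
% S : the set of finite input strings;  f : a computable ground truth function;
% h : a (computable) LLM produced by any training procedure.
\[
  h \text{ hallucinates w.r.t. } f
  \;\iff\;
  \exists\, s \in \mathcal{S} :\; h(s) \neq f(s).
\]
% Inevitability, informally restated: for any computably enumerable family of
% LLMs, there is a computable ground truth on which every member hallucinates.
\[
  \forall\, \{h_i\}_{i \in \mathbb{N}} \ \text{computably enumerable},\;
  \exists\, f \ \text{computable s.t.}\;
  \forall i\; \exists\, s \in \mathcal{S} :\; h_i(s) \neq f(s).
\]
```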

For example, the survey [29] attributes hallucination in natural language generation to heuristic data collection, innate divergence, imperfect representation learning, erroneous decoding, exposure bias, and parametric knowledge bias. A plethora of methods has been proposed to mitigate hallucination.

Up to now, research on LLM hallucination has remained largely empirical. Useful as they are, empirical studies cannot answer the fundamental question: can hallucination be completely eliminated? The answer matters because it indicates a possible upper limit of LLMs' abilities. However, since it is impossible to empirically enumerate and test every possible input, this question cannot be settled without a clear definition and a formal analysis of hallucination. In the real world, formally defining hallucination, i.e., a factual or logical error made by an LLM, turns out to be extremely difficult, because a formal definition of semantics in the real world is still an open problem [12, 58]. Hence, in this work, we rigorously define a formal world of computable functions, wherein precise discussion of hallucination is feasible. In this world, hallucination occurs whenever an LLM fails to exactly reproduce the output of a computable function. Under this definition, we present a fundamental result: hallucination is inevitable for any computable LLM, regardless of model architecture, learning algorithm, prompting technique, or training data. Since this formal world is a part of the real world, the result also applies to LLMs in the real world.
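
The inevitability result described here is, at its core, a diagonal argument. The toy sketch below, written in Python purely for illustration, uses a hypothetical finite list of "LLMs" standing in for the enumerable family of computable models, and shows how a computable ground truth can be constructed to disagree with every enumerated model on at least one input.

```python
# Toy diagonalization sketch (illustrative only; the paper's construction is
# over all computable LLMs, not this hypothetical finite list of models).

# Hypothetical "LLMs": total functions from prompt strings to answer strings.
llms = [
    lambda s: s,          # echoes the prompt
    lambda s: s.upper(),  # uppercases it
    lambda s: s[::-1],    # reverses it
]

def ground_truth(s: str) -> str:
    """A computable ground truth built to disagree with the i-th model on the
    i-th canonical input: whatever the enumerated model answers, return
    something different (the diagonal step)."""
    try:
        i = int(s)        # interpret the prompt "i" as an index into the list
    except ValueError:
        return ""         # arbitrary value off the diagonal
    if 0 <= i < len(llms):
        return llms[i](s) + "#"  # guaranteed to differ from llms[i](s)
    return ""

# Every enumerated "LLM" hallucinates (disagrees with the ground truth) on at
# least one input, namely the string encoding its own index.
for i, model in enumerate(llms):
    prompt = str(i)
    assert model(prompt) != ground_truth(prompt)
print("each toy LLM disagrees with the constructed ground truth somewhere")
```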