Can pretraining data statistics detect hallucinations better than model confidence?
This explores whether entity co-occurrence statistics from the pretraining data provide a more reliable hallucination signal than model confidence. It matters because confidence-based retrieval triggers miss the model's most dangerous mistakes: the confidently wrong ones.
Adaptive RAG systems decide when to retrieve based on the model's own confidence: if the model is uncertain, fetch external evidence. But confidence is a notoriously bad hallucination signal, because models often produce confidently wrong outputs precisely on entities they have seen rarely or on combinations they have never seen together. QuCo-RAG bypasses confidence entirely and consults pretraining-data statistics directly: it checks whether the entities mentioned in a query are rare and, more importantly, whether the specific entity combinations have ever co-occurred in the training data. If a query mentions two entities that the training corpus never saw in proximity, that is the retrieval trigger.
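To make the mechanism concrete, here is a minimal sketch of a co-occurrence-based trigger. The function and parameter names, the document-level counting, and the thresholds are illustrative assumptions, not QuCo-RAG's published implementation; the point is only that the decision reads corpus counts, never the model.

```python
from itertools import combinations

def should_retrieve(query_entities, entity_count, pair_count,
                    rare_entity_threshold=100, cooccurrence_threshold=5):
    """Decide whether to retrieve using pretraining-corpus statistics only.

    query_entities: entities extracted from the query (e.g., via NER).
    entity_count:   entity -> number of pretraining documents mentioning it.
    pair_count:     frozenset({e1, e2}) -> number of documents mentioning both.
    Thresholds are illustrative; a real system would tune them.
    """
    # Trigger 1: any single entity is rare in the pretraining corpus.
    if any(entity_count.get(e, 0) < rare_entity_threshold for e in query_entities):
        return True

    # Trigger 2: some pair of entities has (almost) never co-occurred,
    # even if each entity is individually frequent.
    for e1, e2 in combinations(query_entities, 2):
        if pair_count.get(frozenset((e1, e2)), 0) < cooccurrence_threshold:
            return True

    # Entities and their combinations are well attested: trust parametric knowledge.
    return False
```

A query pairing two individually famous entities that the corpus never places in the same document trips the second branch, even though the model would likely answer it with high confidence.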
The methodological move is replacing an internal symptom (low confidence) with an external cause (data sparsity). Hallucination is what happens when the model interpolates over combinations it never saw; checking pretraining co-occurrence catches the condition before the symptom rather than after. This means QuCo-RAG can flag suspicious outputs even when the model is highly confident, which is the regime where calibration-based methods fail hardest. This stance is in direct tension with When should retrieval happen during model generation?, which treats confidence as the right trigger — see ops/tensions/retrieval trigger signal — pretraining-data statistics vs model uncertainty.md for the full disagreement.
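For contrast, a confidence-based trigger of the kind FLARE-style systems use looks only at the model's own token probabilities. The sketch below is an assumed, generic form of such a trigger, not FLARE's actual code; its blind spot is exactly the regime described above, since a fluent but fabricated continuation can carry uniformly high probabilities and never fire.

```python
import math

def confidence_trigger(token_logprobs, min_prob=0.4):
    """Retrieve if any draft token's probability falls below a threshold.

    token_logprobs: per-token log-probabilities of the model's draft output.
    min_prob is an illustrative threshold. This sees only the internal
    symptom (low confidence), not the external cause (data sparsity).
    """
    return any(math.exp(lp) < min_prob for lp in token_logprobs)
```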
The cost is access to pretraining-data statistics, which is non-trivial for opaque models but tractable for open-weight models whose training corpora are public or documented. The deeper implication is that hallucination detection may benefit more from data-side instrumentation than from probing the model's internal states: the training distribution is the ground truth about what the model can reasonably know, and confidence is only a noisy proxy for it.
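Data-side instrumentation here amounts to indexing the corpus once, up front. A hedged sketch under stated assumptions: `documents` and `extract_entities` are stand-ins for a corpus iterator and an entity-linking step, and the in-memory counters would need a scalable count store for a real pretraining corpus.

```python
from collections import Counter
from itertools import combinations

def build_cooccurrence_index(documents, extract_entities):
    """One pass over the corpus to collect the counts the trigger consumes.

    documents:        iterable of pretraining documents (stand-in).
    extract_entities: callable returning the entities in a document
                      (stand-in for whatever NER / entity linking is used).
    """
    entity_count, pair_count = Counter(), Counter()
    for doc in documents:
        entities = set(extract_entities(doc))
        entity_count.update(entities)
        # Document-level co-occurrence; a sentence- or window-level variant
        # would be stricter about what counts as "seen together".
        pair_count.update(frozenset(p) for p in combinations(entities, 2))
    return entity_count, pair_count
```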
Source: 12 types of RAG
Related concepts in this collection
- When should retrieval happen during model generation?
  Explores whether retrieval should occur continuously, at fixed intervals, or only when the model signals uncertainty. Standard RAG retrieves once; long-form generation requires dynamic triggering based on confidence signals.
  tension: both ask what the right retrieval trigger is but disagree on the signal source: model-internal confidence (FLARE) vs pretraining-data statistics (QuCo-RAG); see ops/tensions/retrieval trigger signal — pretraining-data statistics vs model uncertainty.md
- Can uncertainty estimation replace complex adaptive retrieval?
  Is a simpler approach using model confidence signals sufficient to decide when retrieval is needed, or do complex multi-call adaptive pipelines deliver meaningful benefits?
  tension: argues that uncertainty IS the efficient signal, while QuCo-RAG argues uncertainty is the wrong signal entirely; the same trigger problem yields mutually inconsistent recommendations
- Can any computable LLM truly avoid hallucinating?
  Explores whether formal theorems prove hallucination is mathematically inevitable for all computable language models, regardless of their design or training approach.
  supports: gives the formal reason hallucination cannot be detected model-side; QuCo-RAG accepts this and moves detection to the data side
- When should retrieval actually help versus hurt reasoning?
  Retrieval augmentation seems universally beneficial, but does it always improve reasoning? This explores whether some reasoning steps benefit from internal knowledge alone, and when external retrieval introduces harmful noise rather than useful information.
  extends: another formulation of when-to-retrieve; DeepRAG learns a policy over per-step decisions, while QuCo-RAG provides a single principled trigger that such a policy could use as a feature
- Does reasoning fine-tuning make models worse at declining to answer?
  When models are trained to reason better, do they lose the ability to say 'I don't know'? This matters for high-stakes applications like medical and legal AI that depend on appropriate uncertainty.
  supports: another reason to distrust internal-confidence triggers, since fine-tuning regimes actively suppress the abstention signal FLARE depends on
Original note title: pretraining-data statistics should trigger retrieval not model confidence — rare entity co-occurrence flags hallucination risk that calibration cannot detect