Language Understanding and Pragmatics

Why do language models accept false assumptions they know are wrong?

Explores why LLMs fail to reject false presuppositions embedded in questions even when they possess correct knowledge about the topic. This matters because it reveals a grounding failure distinct from knowledge deficits.

Note · 2026-02-21 · sourced from Natural Language Inference
Related questions: Where exactly does language competence break down in LLMs? · What kind of thing is an LLM really? · How should researchers navigate LLM reasoning research?

The FLEX Benchmark study presents one of the clearest findings about LLM grounding behavior: models do not systematically reject misinformation even when they possess accurate knowledge. That is more troubling than "LLMs don't know things": the models fail to correct claims they demonstrably know to be false.

The setup: LLMs were asked both direct knowledge questions ("Is it true that party X supports Y?") and loaded questions that embedded false presuppositions via factive verbs ("Did voters resent the fact that party X supports Y?" — where the presupposition is false). Models that answered direct questions correctly — demonstrating knowledge — still frequently accommodated the false presupposition in the loaded version rather than rejecting it.
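This paired-question protocol is easy to make concrete. Below is a minimal sketch of the bookkeeping it implies; `ask_model` stands in for any chat-completion call, and the item fields, rejection cues, and the crude "no"-containment check are all hypothetical illustrations, not details from the FLEX paper.

```python
from dataclasses import dataclass

@dataclass
class Item:
    fact: str            # the proposition, e.g. "party X supports Y"
    fact_is_true: bool   # ground truth for the direct question
    direct: str          # "Is it true that party X supports Y?"
    loaded: str          # "Did voters resent the fact that party X supports Y?"

def classify(answer: str) -> str:
    """Crude label: does the answer reject the embedded presupposition?"""
    text = answer.lower()
    cues = ("that's not true", "no such", "does not actually", "false premise")
    return "reject" if any(cue in text for cue in cues) else "accommodate"

def evaluate(items, ask_model):
    """Among items with a false presupposition that the model answers
    correctly in direct form, measure how often the loaded form is rejected."""
    knows, rejects = 0, 0
    for item in items:
        if item.fact_is_true:
            continue  # only false presuppositions probe grounding
        direct_answer = ask_model(item.direct)
        if "no" not in direct_answer.lower():
            continue  # model lacks the knowledge; grounding is untestable here
        knows += 1
        if classify(ask_model(item.loaded)) == "reject":
            rejects += 1
    return rejects / knows if knows else float("nan")
```

The key design point is the filter step: rejection rates are computed only over items where the direct question was answered correctly, which is what isolates grounding failure from knowledge failure.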

Results: GPT-4 achieved the best rejection rate at 84.08%, still far below the ideal 100%. Mistral rejected only 2.44% of false presuppositions and actively amplified the false information at a 91.51% rate. Llama fell in between at roughly 50% rejection. Most revealing: accommodation remained prevalent even when models held strong correct knowledge. In the study's comparison, the lowest grounding score in the weak-belief group was twice the highest grounding score in the strong-belief group, meaning false knowledge produced more accommodation than correct knowledge produced rejection.
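Note that rejection and amplification are separate outcomes of the same loaded question, which is why Mistral's two figures do not sum to 100%. A small, purely illustrative tally (the three labels are hypothetical, mirroring how the percentages above are read) makes the accounting explicit:

```python
from collections import Counter

def rates(labels):
    """labels: per-response strings in {"reject", "accommodate", "amplify"},
    where "amplify" means the answer repeats or elaborates the false premise.
    Returns each label's share of all responses."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: counts[label] / total
            for label in ("reject", "accommodate", "amplify")}

# e.g. rates(["reject", "amplify", "amplify", "accommodate"])
# -> {'reject': 0.25, 'accommodate': 0.25, 'amplify': 0.5}
```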

This has a specific implication: the failure is not a knowledge problem. The models know the correct facts. The failure sits at the level of grounding behavior, namely detecting false presuppositions, flagging them, and initiating correction rather than accommodation. As "Why do language models avoid correcting false user claims?" explores, the issue is conversational strategy, not factual competence. A guard that verifies a question's presupposition before answering (sketched below) makes the distinction concrete.
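One mitigation the grounding framing suggests, though not a method from the FLEX paper: extract the background assumption from a loaded question, verify it as a direct question first, and only answer if it holds. Both prompt templates here are hypothetical.

```python
# Hedged sketch: turn the loaded question back into a direct knowledge check.
PRESUP_PROMPT = (
    "State the background assumption embedded in this question "
    "as a single declarative sentence: {question}"
)
VERIFY_PROMPT = "Is the following statement true? Answer yes or no: {claim}"

def answer_with_presup_check(question: str, ask_model) -> str:
    claim = ask_model(PRESUP_PROMPT.format(question=question))
    verdict = ask_model(VERIFY_PROMPT.format(claim=claim))
    if verdict.strip().lower().startswith("no"):
        # Reject instead of accommodating: surface the false premise.
        return f"The question assumes something false: {claim}"
    return ask_model(question)
```

Because the FLEX results show models answer direct questions correctly, routing the presupposition through a direct question exploits exactly the competence the models already have.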

The political domain makes this especially consequential. False presuppositions are efficient misinformation carriers — they introduce beliefs as background assumptions rather than direct claims, and accommodation means accepting them without scrutiny.


Source: Natural Language Inference
