How does the Question Under Discussion shape what counts as presupposed?

This explores how the Question Under Discussion (QUD) — the implicit question a conversation is currently trying to answer — decides which parts of a sentence get treated as taken-for-granted background rather than as the actual claim being made.

The corpus has a clean answer: what counts as presupposed is fixed not by the words themselves but by what the conversation is currently asking. The sharpest evidence comes from work showing that projection is gradient, not binary. Across 19 English expressions, the same trigger word projects (that is, survives as a background assumption) more or less strongly depending on whether its content addresses the live question, not on any fixed property of the word (see: Does projection strength vary by context or by word type?). Content that isn't 'at-issue' for the current QUD is exactly the content that gets quietly presupposed. So the QUD is the switch: it sorts each piece of a sentence into 'this is the point' versus 'this is assumed.'
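The sorting described above can be sketched as a toy in Python. This is only an illustration under loud assumptions: the `Content` class, the question labels, and the `sort_by_qud` helper are all hypothetical, the propositions are hand-annotated rather than computed, and the split is binary even though the evidence cited above says projection is gradient.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Content:
    """One proposition conveyed by an utterance (hypothetical representation)."""
    text: str
    answers: frozenset  # hand-assigned labels of questions this proposition addresses


def sort_by_qud(contents, qud):
    """Split propositions into (at_issue, backgrounded) relative to one QUD label."""
    at_issue = [c.text for c in contents if qud in c.answers]
    backgrounded = [c.text for c in contents if qud not in c.answers]
    return at_issue, backgrounded


# Assumed decomposition of "Kim stopped smoking" into two propositions.
utterance = [
    Content("Kim used to smoke", frozenset({"did_kim_ever_smoke"})),
    Content("Kim does not smoke now", frozenset({"does_kim_smoke_now"})),
]

# Same sentence, two different live questions, two different backgrounds.
print(sort_by_qud(utterance, "does_kim_smoke_now"))
print(sort_by_qud(utterance, "did_kim_ever_smoke"))
```

The point of the sketch is only that the sentence stays constant while the background changes with the question; a more faithful version would replace the binary split with a graded score.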

That reframes presupposition from a dictionary fact into a conversational move. One note makes this dual nature explicit: presuppositions have two origins. Some are baked into trigger words lexically, but others arise through accommodation, where listeners silently update the shared context so that a mismatched utterance makes sense (see: Do language models miss presuppositions that arise from context?). Accommodation only works because there's a QUD to resolve against; you absorb the assumption because rejecting it would derail the question on the table. This is also why presupposition is a quietly powerful persuasion tool: presenting a new claim as presupposed background lets it bypass the scrutiny an open assertion would invite, since the QUD isn't pointed at it (see: Why are presuppositions more persuasive than direct assertions?).

The most revealing material, though, comes from watching machines fail at this. Language models treat presupposition triggers as surface cues rather than computing what they mean against the discourse, so embedding verbs and triggers become 'blinds' that systematically corrupt their entailment predictions (see: Why do embedding contexts confuse LLM entailment predictions?). More tellingly, models accommodate false presuppositions even when they demonstrably know the facts are wrong: GPT-4 rejects them only 84% of the time, Mistral just 2.44%, and performance roughly halves on questions carrying false assumptions (see: Why do language models accept false assumptions they know are wrong? and Why do language models struggle with questions containing false assumptions?). The diagnosis is that they miss conversationally derived presuppositions as a structural consequence of how they learn: pattern-matching on trigger words can't substitute for tracking the question under discussion (see: Do language models miss presuppositions that arise from context?).
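The "knows the fact but still accommodates" pattern above amounts to a conditional rate, which a short Python sketch can make concrete. The record fields (`knows_fact`, `rejected`), the helper name, and the toy records are all invented for illustration; this is not the scoring procedure of any benchmark named in these notes, and the numbers are not benchmark results.

```python
def rejection_rate_when_known(records):
    """Share of false-presupposition questions the model rejected,
    restricted to items where it answered the direct fact question correctly.
    Returns None if no such items exist."""
    known = [r for r in records if r["knows_fact"]]
    if not known:
        return None
    return sum(1 for r in known if r["rejected"]) / len(known)


# Made-up records: one per false-presupposition question (hypothetical data).
toy_records = [
    {"knows_fact": True, "rejected": True},
    {"knows_fact": True, "rejected": False},  # knows the fact, accommodates anyway
    {"knows_fact": True, "rejected": True},
    {"knows_fact": False, "rejected": False},  # excluded: fact not known
    {"knows_fact": True, "rejected": True},
]

print(rejection_rate_when_known(toy_records))  # 0.75
```

Conditioning on `knows_fact` is the design choice that matters here: it separates failures of knowledge from failures to reject a presupposition the model could have checked.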

Less expected is that this has the same shape as the frame problem. Models stumble not from lacking world knowledge but from failing to bring the right unstated background conditions forward as relevant, and prompting that forces explicit enumeration of those preconditions raises accuracy from 30% to 85% (see: Do language models fail at identifying unstated preconditions?). 'Which background conditions matter right now?' and 'what counts as presupposed right now?' turn out to be the same question, and the QUD is what answers both. Presupposition isn't a property of sentences sitting in isolation; it's a property of sentences relative to what's being asked.

Sources (7 notes)

Does projection strength vary by context or by word type?

Across 19 English expressions, projectivity varies continuously based on whether content addresses the Question Under Discussion. The same presupposition trigger projects more or less depending on context, not on fixed lexical properties.

Do language models miss presuppositions that arise from context?

LLMs learn statistical associations between trigger words and inferences, but presuppositions also arise through accommodation—updating context to resolve discourse mismatches. Models miss these because they require tracking questions under discussion, not pattern matching.

Why are presuppositions more persuasive than direct assertions?

Experimental evidence shows presuppositions with additive, iterative, and factive triggers persuade audiences more than assertions, especially for discourse-new content. The mechanism: presuppositions bypass evaluative scrutiny by presenting claims as already-accepted background.

Why do embedding contexts confuse LLM entailment predictions?

LLMs treat presupposition triggers and non-factive verbs as surface cues rather than computing their opposite semantic effects on entailments. This structural failure persists across prompts and models, suggesting models rely on surface patterns instead of structural analysis.

Why do language models accept false assumptions they know are wrong?

The FLEX Benchmark shows that models reject false presuppositions at rates far below acceptable levels (GPT-4: 84%, Mistral: 2.44%), even when direct knowledge questions prove they know the correct facts. False presuppositions drive more accommodation than correct knowledge drives rejection.

Why do language models struggle with questions containing false assumptions?

The (QA)² benchmark found that zero-shot LLMs halve their performance when questions contain false or unverifiable assumptions compared to valid questions. Even top models reached only 56% acceptability, and the gap persists despite model scaling, suggesting false presuppositions embedded in plausible language are systematically difficult to reject.

Do language models fail at identifying unstated preconditions?

LLMs struggle not from lacking world knowledge but from failing to bring background conditions forward as relevant constraints. Prompting that forces explicit enumeration of preconditions raises accuracy from 30% to 85%, revealing the frame problem persists in statistical systems.
