Can presupposition projection strength vary by context in embeddings?
This note explores whether the strength with which a presupposition 'projects' (survives being embedded under negation, questions, or conditionals) shifts with context, and whether the embedding-based models behind today's LLMs actually track that shift.
The question can be read two ways at once: as a fact about language (does projection strength move with context?) and as a fact about machines (can models built on embeddings capture that movement?). The corpus has a clean answer to the first and a stubborn 'no' to the second. On the linguistics: projection is not a fixed property of a word. Across 19 English expressions, how strongly content projects turns out to be gradient and driven by at-issueness, that is, by whether the content speaks to the current Question Under Discussion (QUD), rather than by lexical class. The same trigger projects more in one context and less in another (note: Does projection strength vary by context or by word type?). So yes: projection strength varies by context, and it does so continuously, not in fixed buckets per word type.
The harder finding is what happens when you ask the models to honor that. Embedding contexts behave as systematic 'blinds': LLMs treat presupposition triggers and non-factive verbs as surface cues and miss that the two have opposite effects on entailments, a failure that persists across prompts and models (note: Why do embedding contexts confuse LLM entailment predictions?). The reason is structural: presuppositions have a dual origin. Some are lexically specified, but many arise through accommodation, where the listener quietly updates the context to resolve a mismatch. Models learn the lexical, statistical half and miss the conversationally derived half, because catching it requires tracking the Question Under Discussion rather than pattern-matching on trigger words (note: Do language models miss presuppositions that arise from context?). That is the same QUD machinery that governs gradient projection in the first place, which is exactly why the context-sensitivity exists in the data but evaporates in the model.
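The contrast between presupposition triggers and non-factive verbs can be made concrete with a small set of minimal-pair probes. The sketch below is only illustrative: the verb lists, the `Probe` type, and `build_probes` are hypothetical names introduced here, and the expected labels are the textbook defaults (a factive complement survives negation, a non-factive one is never entailed). Given the gradient picture described above, those defaults should be read as context-free baselines, not as fixed truths.

```python
from dataclasses import dataclass

# Illustrative verb lists as (third-person form, base form) pairs.
# These are assumptions for the sketch, not items taken from the cited notes.
FACTIVE = (("knows", "know"), ("realizes", "realize"))
NON_FACTIVE = (("believes", "believe"), ("thinks", "think"))


@dataclass(frozen=True)
class Probe:
    premise: str
    hypothesis: str
    expected: str  # textbook default label: "entailment" or "neutral"


def build_probes(subject: str, complement: str) -> list[Probe]:
    """Build plain and negated premises for each verb, paired with the bare complement."""
    hypothesis = complement[0].upper() + complement[1:] + "."
    probes = []
    for verbs, label in ((FACTIVE, "entailment"), (NON_FACTIVE, "neutral")):
        for third_person, base in verbs:
            plain = f"{subject} {third_person} that {complement}."
            negated = f"{subject} does not {base} that {complement}."
            # The default label is the same with and without negation:
            # factive complements project, non-factive ones are never entailed.
            probes.append(Probe(plain, hypothesis, label))
            probes.append(Probe(negated, hypothesis, label))
    return probes


if __name__ == "__main__":
    for probe in build_probes("Sam", "the door is locked"):
        print(f"{probe.expected:>10} | {probe.premise} => {probe.hypothesis}")
```

A model that keys on the surface pattern "verb + that-clause" would be expected to give the same answer across the factive and non-factive rows, which is the kind of confusion the paragraph describes.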
It's worth being precise about where the failure lives, because the embeddings themselves are not empty. Static embeddings encode genuinely rich semantic content (valence, concreteness, iconicity, taboo) before self-attention ever runs, which rules out the idea that there's simply 'no meaning in there' (note: Do transformer static embeddings actually encode semantic meaning?). The breakdown is at the contextual-computation layer, not the lexical one. Models fail to integrate context when prior training associations are strong enough to override it, and textual prompting alone can't fix that; it takes intervention in the representations (note: Why do language models ignore information in their context?). You can watch this directly: LLMs accommodate false presuppositions even when a direct question proves they know the fact is wrong, with rejection rates collapsing as low as 2.44% for some models (note: Why do language models accept false assumptions they know are wrong?).
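One way to make the "knows the fact but accommodates anyway" measurement concrete is to compute the rejection rate only over items where the model answered the direct knowledge question correctly. The sketch below assumes a hypothetical `Trial` record with two boolean fields; it is not the benchmark's own scoring code, and the trials in the usage section are toy values, not reported results.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Trial:
    # Hypothetical fields, assumed for this sketch.
    knows_fact: bool               # answered the direct knowledge question correctly
    rejected_presupposition: bool  # pushed back on the false presupposition


def rejection_rate_given_knowledge(trials: Sequence[Trial]) -> Optional[float]:
    """Share of known-fact trials in which the false presupposition was rejected."""
    known = [t for t in trials if t.knows_fact]
    if not known:
        return None  # undefined when the model knew none of the facts
    return sum(t.rejected_presupposition for t in known) / len(known)


if __name__ == "__main__":
    # Toy trials for illustration only.
    toy = [
        Trial(knows_fact=True, rejected_presupposition=True),
        Trial(knows_fact=True, rejected_presupposition=False),
        Trial(knows_fact=True, rejected_presupposition=False),
        Trial(knows_fact=False, rejected_presupposition=False),
    ]
    print(rejection_rate_given_knowledge(toy))
```

Conditioning on `knows_fact` is the design choice that matters here: it separates accommodation from plain ignorance, so a low rate cannot be explained by the model simply not knowing the fact.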
The deeper pattern, and the thing you might not have come looking for: this is the same crack that shows up across pragmatics. Scalar implicature is the close cousin of presupposition, and models show no context-sensitivity there either. They don't flex their inferences for literal-mode instructions, information focus, or face-threatening situations the way humans do (note: Can language models adapt implicature to conversational context?). Underneath both sits a frequency bias: models systematically prefer high-frequency surface forms over meaning-equivalent rare ones, suggesting they track statistical mass rather than recompute meaning per context (note: Do language models really understand meaning or just surface frequency?). So the honest synthesis is a split verdict: projection strength genuinely is context-variable in human language, but the context-tracking that produces that variability (at-issueness, QUD, accommodation) is precisely the competence current embedding-based models lack.
Sources (8 notes)
Across 19 English expressions, projectivity varies continuously based on whether content addresses the Question Under Discussion. The same presupposition trigger projects more or less depending on context, not on fixed lexical properties.
LLMs treat presupposition triggers and non-factive verbs as surface cues rather than computing their opposite semantic effects on entailments. This structural failure persists across prompts and models, suggesting models rely on surface patterns instead of structural analysis.
LLMs learn statistical associations between trigger words and inferences, but presuppositions also arise through accommodation—updating context to resolve discourse mismatches. Models miss these because they require tracking questions under discussion, not pattern matching.
Clustering analysis of RoBERTa embeddings reveals sensitivity to five psycholinguistic measures including valence, concreteness, iconicity, and taboo. This demonstrates that static embeddings function as genuine lexical entries containing semantic content before self-attention operates.
Research demonstrates that LMs generate outputs inconsistent with their context because parametric knowledge from training dominates over in-context information. Textual prompting alone cannot override strong priors; causal intervention in representations is required.
The FLEX Benchmark shows that models reject false presuppositions at rates far below acceptable levels (GPT-4: 84%, Mistral: 2.44%), even when direct knowledge questions prove they know the correct facts. False presuppositions drive more accommodation than correct knowledge drives rejection.
ChatGPT shows no context-sensitivity in computing scalar implicatures across three dimensions: explicit literal-mode instructions, information structure focus, and face-threatening contexts. Humans flexibly modulate these inferences; the model does not, suggesting pragmatic competence requires tracking communicative stakes that LLMs systematically miss.
LLMs show consistent preference for higher-frequency surface forms over semantically equivalent rare paraphrases across math, machine translation, commonsense reasoning, and tool calling. This suggests models track statistical mass from pretraining rather than meaning-recognition as their primary mechanism.