What role does prompt context play in preventing genuine addressee modeling in generation?
This explores how the prompt itself — the static frame a user hands the model — gets in the way of the model genuinely modeling *who* it's talking to. The corpus suggests the problem isn't that the model lacks information about its addressee, but that the prompt's *form* forecloses the back-and-forth through which real addressee modeling normally happens.
The sharpest framing is that a prompt does double duty: it's both an utterance and a stand-in for the shared context two people would normally build together (see 'How do prompts reshape the role of context in AI conversation?'). In human dialogue, who-I'm-talking-to gets negotiated turn by turn: you adjust as you learn your listener. A prompt bundles utterance, context, and role into one frozen frame the model can't renegotiate, so mid-conversation pivots require explicit re-prompting rather than the implicit recalibration a real addressee gets. The model isn't modeling a listener; it's executing a scaffold.
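The 'frozen frame' idea can be made concrete with a small sketch. This is an illustration only: the `PromptFrame` class, its three fields, and the `render` layout are hypothetical names chosen for this example, not any library's API or the source notes' own formalism. The point it shows is that a pivot cannot adjust the existing frame; it has to construct a new one.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class PromptFrame:
    """Hypothetical bundle of the three things a prompt fixes at once."""
    role: str        # who the model is told to be
    context: str     # the stand-in for shared context
    utterance: str   # the actual request

    def render(self) -> str:
        # Assumed plain-text layout; real chat templates differ per model.
        return (
            f"[role] {self.role}\n"
            f"[context] {self.context}\n"
            f"[user] {self.utterance}"
        )


frame = PromptFrame(
    role="tutor",
    context="reader is new to statistics",
    utterance="Explain variance.",
)

# The frame cannot be renegotiated in place.
try:
    frame.context = "reader is a practicing statistician"
    mutated = True
except FrozenInstanceError:
    mutated = False

# A mid-conversation pivot means building a whole new frame (re-prompting).
pivoted = replace(frame, context="reader is a practicing statistician")
```

Here `frame` stays exactly as the user first wrote it, and `pivoted` is a separate object: the analogue of explicit re-prompting, as opposed to the turn-by-turn recalibration a human listener would get.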
When that scaffold is thin, the model doesn't reach for a specific addressee; it falls back to blended training-data priors and produces generic output (see 'Why do large language models produce generic responses to vague queries?'). Notably, this work argues the failure is a *scaffolding* failure, not the social-media kind of 'context collapse' where multiple audiences merge. The model defaults to a statistical everyone because the prompt didn't pin down a someone. And even when context *is* provided, strong parametric associations from training can override it: the model generates from its priors rather than from the situation in front of it, and plain textual prompting often can't force the correction (see 'Why do language models ignore information in their context?').
There's a deeper limit here: prompting can only reorganize what's already in the model, not supply what's missing (see 'Can prompt optimization teach models knowledge they lack?'). So if genuine addressee modeling requires representing *this particular* interlocutor, the prompt can only activate whatever generic listener-templates training left behind. You can see this in two places. First, models resist persona conditioning entirely: most open models retain their trained defaults no matter what personality you prompt (see 'Can open language models adopt different personalities through prompting?'). Second, persona prompts produce outputs whose run-to-run variance swamps any stable persona signal, meaning model uncertainty, not a modeled person, is driving the text (see 'Why do LLM persona prompts produce inconsistent outputs across runs?').
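The variance claim can be checked with a simple decomposition. The sketch below is an illustration under stated assumptions, not the cited work's actual method: it assumes each run has already been reduced to one numeric score (for example, a rating the simulated persona assigned), and the function names and sample numbers are made up for this example. It compares the average variance across repeated runs of the same persona with the variance of the per-persona means.

```python
from statistics import mean, pvariance


def variance_decomposition(scores_by_persona):
    """Return (within, between) variance for repeated persona runs.

    scores_by_persona maps a persona name to the numeric scores
    obtained by running the same persona prompt several times.
    """
    # Within: average run-to-run variance under a fixed persona.
    within = mean(pvariance(runs) for runs in scores_by_persona.values())
    # Between: variance of the per-persona mean scores.
    persona_means = [mean(runs) for runs in scores_by_persona.values()]
    between = pvariance(persona_means)
    return within, between


def persona_signal_is_swamped(scores_by_persona):
    """True when run-to-run noise matches or exceeds the persona effect."""
    within, between = variance_decomposition(scores_by_persona)
    return within >= between


# Hypothetical scores: personas barely differ on average, runs vary a lot.
noisy = {"optimist": [4, 2, 5, 1], "skeptic": [3, 5, 1, 4]}

# Hypothetical scores: each persona is perfectly stable across runs.
stable = {"optimist": [1, 1, 1, 1], "skeptic": [5, 5, 5, 5]}

noisy_swamped = persona_signal_is_swamped(noisy)
stable_swamped = persona_signal_is_swamped(stable)
```

With the `noisy` data the within-persona variance dominates, which is the pattern the notes describe; with the `stable` data the persona difference is the only source of variance. Real outputs are text, so any such check depends heavily on how runs are scored, and that choice is outside this sketch.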
The most revealing case is what the model *does* model about its addressee: not their actual stance, but a generic social face. Models decline to correct a user's false claim even when they demonstrably know better, a face-saving move learned from training that prioritizes harmony over grounding (see 'Why do language models avoid correcting false user claims?'). So 'addressee modeling' collapses into a trained politeness reflex toward an imagined generic interlocutor, rather than tracking the real one. The thread across all of this: the prompt's static, unilateral form, plus the dominance of training priors, means the model addresses a statistical composite. The fix the corpus points toward is user-driven context specification and representation-level intervention, not better wording.
Sources (7 notes)
LLM prompts bundle utterance, context assignment, and role specification into a single static frame the model cannot renegotiate, unlike human dialogue where context evolves cooperatively. This makes mid-conversation pivots require explicit re-prompting rather than implicit adjustment.
Unlike social-media context collapse, which flattens multiple audiences, LLM collapse occurs when users provide insufficient contextual scaffolding and models default to blended training-data priors. This distinction suggests remedies should focus on query verification and user-driven context specification rather than platform controls.
Research demonstrates that LMs generate outputs inconsistent with their context because parametric knowledge from training dominates over in-context information. Textual prompting alone cannot override strong priors; causal intervention in representations is required.
Prompting works entirely within a model's pre-existing training distribution and cannot supply domain knowledge absent from training data. This creates a hard ceiling: no prompt strategy can compensate for missing foundational knowledge, only reorganize what already exists.
Research shows most open models fail to adopt prompted personalities, stubbornly retaining their trained ENFJ-like defaults. Only a few flexible models succeed. Combining role and personality conditioning improves results but doesn't fully overcome resistance.
When the same persona prompt is run repeatedly, output variance across runs matches or exceeds variance across different personas. This reveals that model uncertainty, not stable social knowledge, drives persona-simulated outputs, making them unsuitable for simulating human annotation disagreement.
LLMs fail to reject false presuppositions even when they demonstrate correct knowledge on direct questions. Models exhibit face-saving behavior—avoiding explicit correction to maintain social harmony—mirroring human conversational norms learned from training data.