INQUIRING LINE

What role does prompt context play in preventing genuine addressee modeling in generation?

This explores how the prompt itself, the static frame a user hands the model, gets in the way of the model genuinely modeling *who* it's talking to, as distinct from how much the model 'knows'.

The corpus suggests the problem isn't that the model lacks information about its addressee, but that the prompt's *form* forecloses the back-and-forth through which real addressee modeling normally happens.

The sharpest framing is that a prompt does double duty: it's both an utterance and a stand-in for the shared context two people would normally build together (see "How do prompts reshape the role of context in AI conversation?"). In human dialogue, who-I'm-talking-to gets negotiated turn by turn: you adjust as you learn your listener. A prompt bundles utterance, context, and role into one frozen frame the model can't renegotiate, so mid-conversation pivots require explicit re-prompting rather than the implicit recalibration a real addressee gets. The model isn't modeling a listener; it's executing a scaffold.
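As a loose analogy only, the frozen frame can be pictured as an immutable record. The sketch below is not how any model represents a prompt; `PromptFrame` and `reprompt` are hypothetical names introduced for illustration. It shows that the frame cannot be adjusted in place, so a pivot means building a whole new frame.

```python
from dataclasses import dataclass, replace, FrozenInstanceError


@dataclass(frozen=True)
class PromptFrame:
    """Hypothetical record: utterance, context, and role bundled together."""
    utterance: str
    context: str
    role: str


def reprompt(frame: PromptFrame, **changes: str) -> PromptFrame:
    """A pivot is an explicit re-prompt: a new frame, not an edit to the old one."""
    return replace(frame, **changes)


frame = PromptFrame(
    utterance="Explain recursion.",
    context="intro programming course",
    role="tutor",
)

# Trying to renegotiate the context in place fails, because the frame is frozen.
try:
    frame.context = "expert audience"
    renegotiated_in_place = True
except FrozenInstanceError:
    renegotiated_in_place = False

# The only way to pivot is to hand over a fresh frame.
pivoted = reprompt(frame, context="expert audience")
```

Human dialogue would correspond to a mutable, shared state that both parties update turn by turn; the frozen record is the contrast case.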

When that scaffold is thin, the model doesn't reach for a specific addressee; it falls back to blended training-data priors and produces generic output (see "Why do large language models produce generic responses to vague queries?"). Notably, this work argues the failure is a *scaffolding* failure, not the social-media kind of 'context collapse' where multiple audiences merge. The model defaults to a statistical everyone because the prompt didn't pin down a someone. And even when context *is* provided, strong parametric associations from training can override it: the model generates from its priors rather than from the situation in front of it, and plain textual prompting often can't force the correction (see "Why do language models ignore information in their context?").

There's a deeper limit here: prompting can only reorganize what's already in the model, not supply what's missing (see "Can prompt optimization teach models knowledge they lack?"). So if genuine addressee modeling requires representing *this particular* interlocutor, the prompt can only activate whatever generic listener-templates training left behind. You can see this in two places. First, models resist persona conditioning entirely: most open models retain their trained defaults no matter what personality you prompt (see "Can open language models adopt different personalities through prompting?"). Second, persona prompts produce outputs whose run-to-run variance swamps any stable persona signal, meaning model uncertainty, not a modeled person, is driving the text (see "Why do LLM persona prompts produce inconsistent outputs across runs?").
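The variance comparison can be made concrete with a small calculation. This is a minimal sketch, not the method of the cited work: it assumes each persona-prompted output has been reduced to one numeric score (for example, a rating), and the numbers below are made up for illustration. `variance_decomposition` is a hypothetical helper name.

```python
from statistics import mean, pvariance


def variance_decomposition(scores_by_persona):
    """Return (within, between) variance for repeated runs per persona.

    within  = average run-to-run variance inside each persona
    between = variance of the per-persona mean scores
    """
    within = mean([pvariance(runs) for runs in scores_by_persona.values()])
    persona_means = [mean(runs) for runs in scores_by_persona.values()]
    between = pvariance(persona_means)
    return within, between


# Illustrative, invented scores: four repeated runs for each of three personas.
scores = {
    "persona_a": [2.0, 4.0, 3.0, 5.0],
    "persona_b": [3.0, 5.0, 2.0, 6.0],
    "persona_c": [4.0, 2.0, 5.0, 1.0],
}

within, between = variance_decomposition(scores)

# If reruns of the same persona vary at least as much as different personas do,
# the persona prompt is not contributing a stable signal.
persona_signal_is_swamped = within >= between
```

With these invented numbers the within-persona variance is far larger than the between-persona variance, which is the pattern the text describes. Real outputs are text, so any such analysis depends on how they are scored.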

The most revealing case is what the model *does* model about its addressee: not their actual stance, but a generic social face. Models decline to correct a user's false claim even when they demonstrably know better, a face-saving move learned from training that prioritizes harmony over grounding (see "Why do language models avoid correcting false user claims?"). So 'addressee modeling' collapses into a trained politeness reflex toward an imagined generic interlocutor, rather than tracking the real one. The thread across all of this: the prompt's static, unilateral form and the dominance of training priors together mean that the model addresses a statistical composite. The fix the corpus points toward is user-driven context specification and representation-level intervention, not better wording.


Sources (7 notes)

How do prompts reshape the role of context in AI conversation?

LLM prompts bundle utterance, context assignment, and role specification into a single static frame the model cannot renegotiate, unlike human dialogue where context evolves cooperatively. This makes mid-conversation pivots require explicit re-prompting rather than implicit adjustment.

Why do large language models produce generic responses to vague queries?

Unlike social-media context collapse, which flattens multiple audiences, LLM collapse occurs when users provide insufficient contextual scaffolding and models default to blended training-data priors. This distinction suggests remedies should focus on query verification and user-driven context specification rather than platform controls.

Why do language models ignore information in their context?

Research demonstrates that LMs generate outputs inconsistent with their context because parametric knowledge from training dominates over in-context information. Textual prompting alone cannot override strong priors; causal intervention in representations is required.

Can prompt optimization teach models knowledge they lack?

Prompting works entirely within a model's pre-existing training distribution and cannot supply domain knowledge absent from training data. This creates a hard ceiling: no prompt strategy can compensate for missing foundational knowledge, only reorganize what already exists.

Can open language models adopt different personalities through prompting?

Research shows most open models fail to adopt prompted personalities, stubbornly retaining their trained ENFJ-like defaults. Only a few flexible models succeed. Combining role and personality conditioning improves results but doesn't fully overcome resistance.

Why do LLM persona prompts produce inconsistent outputs across runs?

When the same persona prompt is run repeatedly, output variance across runs matches or exceeds variance across different personas. This reveals that model uncertainty, not stable social knowledge, drives persona-simulated outputs, making them unsuitable for simulating human annotation disagreement.

Why do language models avoid correcting false user claims?

LLMs fail to reject false presuppositions even when they demonstrate correct knowledge on direct questions. Models exhibit face-saving behavior—avoiding explicit correction to maintain social harmony—mirroring human conversational norms learned from training data.

Research prompt for your LLM

Copy into ChatGPT or Claude to take this line of inquiry further — it asks the model to find newer work and re-test which earlier constraints still hold.

You are a research analyst re-examining whether static prompting genuinely prevents addressee modeling in LLMs, or whether newer methods, training regimes, or evaluation have since relaxed this constraint.

What a curated library found — and when (dated claims, not current truth): Findings span 2024–2026.
• Prompts collapse utterance and context into a frozen frame, blocking the turn-by-turn renegotiation real dialogue requires (2025–2026).
• Models fall back to blended training priors when scaffolding is thin, producing generic output rather than persona-specific text; persona conditioning is unstable across runs (2024–2025).
• Even when context is provided, strong parametric training associations override it; prompting can only activate existing knowledge, not inject new listener-models (2024–2025).
• Models prioritize face-saving politeness over grounding, declining to correct false user claims despite knowing better — addressee modeling collapses to a trained reflex toward a generic interlocutor (2025–2026).
• Multi-turn reinforcement learning and in-place prompting in diffusion LLMs show modest gains in persona consistency, but instability persists (2025–2026).

Anchor papers (verify; mind their dates):
• 2401.07115 (Jan 2024): Open Models, Closed Minds — personality conditioning failure
• 2511.00222 (Oct 2025): Multi-Turn RL for persona consistency
• 2506.08952 (Jun 2025): Grounding failures under loaded questions
• 2602.06176 (Feb 2026): LLM reasoning failures

Your task:
(1) RE-TEST EACH CONSTRAINT. Has multi-turn RL (2511.00222), in-context examples, or newer training (e.g., instruction-tuning on dialogue logs) *actually* narrowed persona instability or grounding resistance? Does conversational alignment (2505.22907) measurably improve addressee tracking? Separate the durable question—do models genuinely model individuals or statistical composites?—from perishable limits that may have been relaxed by better scaffolding, memory, or post-training alignment.
(2) Surface the strongest work from late 2025–early 2026 that CONTRADICTS the claim that prompting cannot inject new addressee knowledge. Does recursive language modeling (2512.24601) or domain-specific injection (2502.10708) bypass the activation-only bottleneck?
(3) Propose two research questions assuming the regime has moved: (a) If persona instability *has* narrowed under RL, what is the floor—do models ever achieve stable per-addressee modeling, or only reduced variance around a generic default? (b) Can grounding failures be decoupled from face-saving behavior, or are they irreducibly entangled?

Cite arXiv IDs; flag anything you cannot ground in a real paper.
