Why do current conversational AI systems fail to develop shared vocabulary with users?

This explores why AI chat systems don't build up a shared way of talking with you over a conversation (adopting your words, your conventions, your references) the way two people naturally do. The corpus points to a single root cause that shows up in many guises: models are trained to predict information, not to do the relational work of conversation. The clearest case is lexical entrainment. Humans unconsciously converge on each other's word choices to build rapport and reduce ambiguity, and current response models simply don't do it; vocabulary adaptation toward the user is absent despite being foundational to human dialogue (see "Why don't conversational AI systems mirror their users' word choices?"). Shared vocabulary is one instance of a broader missing layer: the implicit maintenance techniques, such as reference repair, topic hand-offs, and convention-building, that keep dialogue smooth. These never develop because training signals reward predicting the next informative token, not sustaining a relationship (see "Why don't language models develop conversation maintenance skills?").
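One rough way to make "vocabulary adaptation toward the user" concrete is to measure how much of a reply's content vocabulary was already introduced by the user. The sketch below is an illustration only: the function names, the tiny stopword list, and the example sentences are all assumptions made here, not a metric taken from the source notes.

```python
import re

# Hypothetical, deliberately small stopword list; a real measure would use a fuller one.
STOPWORDS = {"the", "a", "an", "to", "of", "and", "is", "it", "in", "on",
             "for", "i", "you", "my", "your", "can", "that", "this"}

def content_words(text: str) -> set[str]:
    """Return the lowercase word types in `text`, with stopwords removed."""
    return {w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS}

def entrainment_score(user_turns: list[str], reply: str) -> float:
    """Share of the reply's content-word types that the user has already used."""
    user_vocab: set[str] = set()
    for turn in user_turns:
        user_vocab |= content_words(turn)
    reply_vocab = content_words(reply)
    if not reply_vocab:
        return 0.0
    return len(reply_vocab & user_vocab) / len(reply_vocab)

# Made-up example: the user coins "build box" and "nightly job".
user = ["My build box keeps dropping the nightly job."]
aligned = "The nightly job on your build box is dropping."
unaligned = "The scheduled task on the CI server fails."

print(entrainment_score(user, aligned))    # reply reuses the user's terms
print(entrainment_score(user, unaligned))  # reply substitutes its own terms
```

A word-overlap score like this only captures surface reuse. It says nothing about whether both sides agree on what a term refers to, which is the harder part of convention formation.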

Look one level down and the failure is structural, not accidental. Standard RLHF optimizes for immediate, single-turn helpfulness, which actively discourages the behaviors that build shared ground: asking clarifying questions and establishing conventions that pay off in later turns. When the reward only looks at the next turn, the model has no incentive to invest in a vocabulary that becomes useful three turns from now (see "Why do language models respond passively instead of asking clarifying questions?"). The same logic makes agents structurally passive: they can't initiate, plan, or lead, because alignment optimizes for responding to queries rather than acting from goals, and co-building a shared language requires initiative, not just reaction (see "Why can't conversational AI agents take the initiative?").
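The incentive argument can be shown with a toy comparison. The sketch below assumes a simple discounted sum as the stand-in for a "multi-turn-aware" reward; that choice, the discount factor, and every per-turn score are illustrative assumptions, not the reward actually used by CollabLLM or any RLHF pipeline.

```python
def discounted_return(rewards: list[float], gamma: float = 0.9) -> float:
    """Sum of per-turn rewards, discounted by gamma per turn (assumed objective)."""
    return sum(r * gamma**t for t, r in enumerate(rewards))

# Made-up per-turn helpfulness scores for two strategies over three turns.
strategies = {
    # Answer immediately: decent first turn, weaker follow-ups due to misunderstanding.
    "answer_now": [0.6, 0.3, 0.3],
    # Ask a clarifying question first: weak first turn, stronger later turns.
    "clarify_first": [0.2, 0.9, 0.9],
}

# A single-turn objective only sees the first reward.
single_turn_choice = max(strategies, key=lambda name: strategies[name][0])

# A multi-turn objective sees the whole (discounted) trajectory.
multi_turn_choice = max(strategies, key=lambda name: discounted_return(strategies[name]))

print(single_turn_choice)  # strategy preferred by the next-turn-only reward
print(multi_turn_choice)   # strategy preferred once later turns count
```

With these numbers the two objectives disagree, which is the whole point: an investment that costs something now and pays off later is invisible to a reward that stops at the next turn.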

There's also a deeper architectural gap. Building shared vocabulary means tracking what both speakers believe and how that converges from partial to mutual understanding. Token-level LLMs lack the machinery for this; frameworks like Collaborative Rational Speech Acts (CRSA) add bidirectional belief tracking precisely because it's the information-theoretic layer LLMs don't have (see "Can dialogue systems track both speakers' beliefs across turns?"). Without modeling the other mind, there's no "shared" to converge on, only fluent output that sounds like it understands.
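To give a feel for what "modeling the other mind" involves, here is a minimal sketch of the standard single-turn Rational Speech Acts recursion on a toy reference game. It is plain RSA with a uniform prior, not CRSA: the multi-turn belief updates and the rate-distortion component described in the source note are not shown. The objects, utterances, and function names are assumptions chosen for illustration.

```python
OBJECTS = ["blue_square", "blue_circle", "green_square"]
UTTERANCES = ["blue", "green", "square", "circle"]

def is_true(utt: str, obj: str) -> bool:
    """An utterance is literally true of an object if it names its color or shape."""
    return utt in obj.split("_")

def normalize(weights: dict[str, float]) -> dict[str, float]:
    """Scale non-negative weights so they sum to 1."""
    total = sum(weights.values())
    return {k: v / total for k, v in weights.items()}

def literal_listener(utt: str) -> dict[str, float]:
    """L0: uniform belief over the objects the utterance is literally true of."""
    return normalize({o: 1.0 if is_true(utt, o) else 0.0 for o in OBJECTS})

def pragmatic_speaker(obj: str, alpha: float = 1.0) -> dict[str, float]:
    """S1: prefers utterances that lead the literal listener to the intended object."""
    return normalize({u: literal_listener(u)[obj] ** alpha for u in UTTERANCES})

def pragmatic_listener(utt: str, alpha: float = 1.0) -> dict[str, float]:
    """L1: infers the object by reasoning about which speaker would say `utt`."""
    return normalize({o: pragmatic_speaker(o, alpha)[utt] for o in OBJECTS})

print(literal_listener("blue"))    # splits evenly between the two blue objects
print(pragmatic_listener("blue"))  # leans toward blue_square
```

The pragmatic listener leans toward `blue_square` because a speaker who meant `blue_circle` had the unambiguous word "circle" available. That inference depends on a model of the speaker, which is the kind of structure a purely token-level predictor does not represent explicitly.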

Here's the turn you might not expect: this isn't one bug but a category error baked into design. AI interfaces borrow conversational conventions, which switches on your lifelong communication skills, including the expectation that your partner will adopt your terms, but the system isn't actually communicating, just producing strings (see "Why do users fail with AI interfaces designed like conversations?"). And the fix isn't uniform: lexical alignment specifically drives task efficiency and comprehension, while emotional and prosodic alignment drive warmth and trust; conflating them produces cold or evasive bots (see "Do different types of alignment serve different conversational goals?"). The encouraging note is that shared vocabulary is teachable: post-training on coreference-identified user preferences can give models in-context convention formation (see "Why don't conversational AI systems mirror their users' word choices?"). The absence is a consequence of what we optimize for, not a hard limit of what these systems could do.

Sources (7 notes)

Why don't conversational AI systems mirror their users' word choices?

Response generation models fail to adapt their vocabulary toward users' lexical choices, an adaptation central to human rapport and clarity. Post-training via Direct Preference Optimization (DPO) on coreference-identified preferences can teach models in-context convention formation.

Why don't language models develop conversation maintenance skills?

Humans keep conversations smooth through implicit techniques like reference repair and topic hand-off, which sustain relational interaction rather than convey information. Language models don't develop these because training signals reward information prediction, not relational work.

Why do language models respond passively instead of asking clarifying questions?

CollabLLM demonstrates that standard RLHF training optimizes for immediate helpfulness, discouraging models from asking clarifying questions or offering multi-turn insights. Multi-turn-aware rewards that estimate long-term interaction value enable active intent discovery and genuine collaboration.

Why can't conversational AI agents take the initiative?

Research shows LLMs including ChatGPT cannot initiate topics, plan strategically, or lead conversations because their training optimizes for responding to queries, not creating dialogue from agent goals. This passivity is reinforced by alignment objectives and masked by fluent-sounding outputs.

Can dialogue systems track both speakers' beliefs across turns?

Collaborative Rational Speech Acts (CRSA) integrates rate-distortion theory with the Rational Speech Acts (RSA) framework to enable bidirectional belief tracking across dialogue turns. Demonstrated on referential games and doctor-patient dialogues, it captures the progression from partial to shared understanding, providing the information-theoretic framework that token-level LLM systems lack.

Why do users fail with AI interfaces designed like conversations?

AI interfaces that use conversational design conventions trigger users' lifelong communication skills, but AI doesn't actually communicate. This mismatch causes interaction failures that feel like user error but originate in design.

Do different types of alignment serve different conversational goals?

A 2020–2025 systematic review shows lexical alignment drives task efficiency and comprehension, while emotional and prosodic alignment drive relational warmth and trust. Conflating them in design produces category errors—cold customer-service bots and evasive mental-health assistants.
