Language Understanding and Pragmatics · Conversational AI Systems · Psychology and Social Cognition

Why don't language models develop conversation maintenance skills?

Explores whether systems trained on text can learn the implicit techniques humans use to keep conversations on track, and why those techniques might resist the standard training approach.

Note · 2026-04-14

A conversation that runs smoothly is doing constant maintenance work. Speakers track who is talking, what each party knows, where the topic has been, where it is going. They reference prior turns without restating them. They repair misunderstandings without flagging the repair. They hand off topics through subtle pivots. They update common ground each turn without explicit acknowledgment. The maintenance is so pervasive and so implicit that it is invisible to participants — they only notice when it fails.
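One way to make this list concrete is to model the maintenance state a speaker implicitly tracks. The following is a minimal sketch under stated assumptions: the class and method names are hypothetical illustrations, not claims about how humans (or any system) actually represent conversation.

```python
from dataclasses import dataclass, field

@dataclass
class ConversationState:
    """Hypothetical sketch of the state that conversational maintenance tracks."""
    speaker: str = ""                                     # who is talking
    common_ground: set = field(default_factory=set)       # what both parties share
    topic_history: list = field(default_factory=list)     # where the topic has been

    def take_turn(self, speaker, content, references=()):
        # Implicit reference: prior turns are used, not restated. If a
        # referenced item is missing from common ground, repair silently.
        missing = [r for r in references if r not in self.common_ground]
        if missing:
            self.repair(missing)
        self.speaker = speaker
        self.topic_history.append(content)
        self.common_ground.add(content)  # update common ground each turn

    def repair(self, missing):
        # Repair without flagging the repair: re-establish lost items
        # instead of announcing a misunderstanding.
        self.common_ground.update(missing)
```

Note that every operation here happens as a side effect of taking a turn; nothing in the transcript itself records that tracking, repair, or updating occurred.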

These techniques are not features of language understood as an information-encoding system. They are features of language understood as social action. Their function is not to convey information; it is to sustain a relational interaction in which information conveyance happens. A linguistic act can convey identical information with or without the maintenance work — the difference is whether the act sustains the conversation or breaks it. Maintenance is orthogonal to content.

This explains why systems trained on language as information expression do not develop maintenance techniques. The training signal does not include the relational stakes that make maintenance work valuable. Text-corpus training rewards models for predicting the next token in a string; nothing in the loss function rewards them for performing the implicit reference, repair, or update operations that maintain conversation. The operations are not in the data because they live below the level of what data encodes — they live in the doing-with-the-data, not in the data itself.
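The shape of that loss function can be sketched directly. The toy "model" below is a hypothetical lookup table, not any real architecture; the point is only what the objective can see: a string of tokens, nothing about the interactional work the string performed.

```python
import math

def next_token_nll(model, tokens):
    """Average negative log-likelihood of a token sequence under `model`.

    The loss is a function of the string alone. Whether an utterance
    repaired a misunderstanding, updated common ground, or broke the
    conversation is invisible to this objective.
    """
    nll = 0.0
    for i in range(1, len(tokens)):
        # Probability the model assigns to the token that actually came next.
        p = model(tokens[:i]).get(tokens[i], 1e-9)
        nll -= math.log(p)
    return nll / (len(tokens) - 1)

# Hypothetical toy model: a fixed lookup on the last token, just to run the loss.
TABLE = {"i": {"mean": 0.5, "said": 0.5}, "mean": {"tuesday": 1.0}}

def toy_model(context):
    return TABLE.get(context[-1], {})

print(next_token_nll(toy_model, ["i", "mean", "tuesday"]))  # math.log(2) / 2
```

The transcript here contains a self-repair marker ("i mean"), but to the loss it is just another token to predict: the repair operation leaves no trace in the objective beyond its surface form.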

This connects to a broader theoretical claim about language. Information-theoretic treatments model meaning as content the speaker encodes and the receiver decodes. Pragmatic and interactionist treatments model meaning as a relational achievement, produced in part by the maintenance work that information-theoretic accounts cannot describe. The two treatments make different predictions about what an artificial language system must do to participate in conversation: the information-theoretic account predicts it must produce informative content; the pragmatic account predicts it must perform maintenance. The empirical conversational failures of AI systems favor the pragmatic prediction: the missing ingredient is not informativeness but maintenance.

The diagnostic implication is that "more conversational data" cannot close the maintenance gap, because the data does not contain the maintenance — it contains the conversations that maintenance produced. Adding data adds more output; what is missing is the operation that produced the output. Closing the gap would require training on the operation (agents in actual interaction performing maintenance) rather than on the artifacts of operation (text logs of conversations that included maintenance).
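The operation-vs-artifact contrast can be sketched as two training regimes. Everything below is an illustrative assumption, not an actual training recipe: the class names, the reward shape, and the partner interface are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class ArtifactTraining:
    """Learn from text logs: the maintenance already happened offstage."""
    logs: list  # finished transcripts of conversations

    def step(self, predict_loss):
        # The only feedback is how well the model predicts the artifact.
        return sum(predict_loss(t) for t in self.logs)

class OperationTraining:
    """Learn in live interaction: the signal includes relational stakes."""

    def step(self, agent, partner, turns=3):
        state = {"common_ground": set(), "alive": True}
        for _ in range(turns):
            utterance = agent(state)
            state = partner(utterance, state)  # partner may misunderstand or disengage
            if not state["alive"]:
                return -1.0  # the broken conversation itself is the signal
        # Reward sustained common ground rather than token prediction.
        return float(len(state["common_ground"]))
```

In the first regime the signal is prediction accuracy over static output; in the second, the signal depends on whether the interaction survives, which is exactly the stake the note argues text corpora cannot carry.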

"Why do dialogue failures persist despite scaling language models?" states the training-mode claim; this note supplies the operation-vs-artifact distinction that the training mode encodes. Together they specify why scaling dialogue data has produced limited progress on maintenance-specific failures.

The strongest counterargument: maintenance can be inferred from conversational data given sufficient model sophistication. This is possible at the limit, but inferring maintenance from text asks the model to recover an operation from its surface effects, a much harder problem than learning the operation directly. The empirical pattern is consistent with this difficulty.


Source: Communication vs Language

Original note title: conversation maintenance techniques are implicit and belong to language as social action, not language as information expression