INQUIRING LINE

Can structural conversation analysis replace text-based reward signals for AI alignment?

This explores whether the formal structures conversation analysts use to describe how human dialogue actually works (turn-taking, repair, clarification sequences) could stand in for the text-based reward signals, such as RLHF, that currently shape AI behavior.


The honest answer from the corpus is: probably not a clean replacement. But the question exposes exactly what text-based rewards are blind to, and that is the more interesting story.

The corpus is unusually pointed about *why* text-based rewards fail. Standard RLHF optimizes for immediate, next-turn helpfulness, which quietly trains models to answer passively rather than ask, probe, or discover what the user actually wants (see "Why do language models respond passively instead of asking clarifying questions?"). The result is a structurally passive agent that can't initiate, plan, or lead (see "Why can't conversational AI agents take the initiative?"). And because the training signal rewards predicting information, models never develop the implicit relational moves, such as reference repair and topic hand-off, that keep human conversations smooth (see "Why don't language models develop conversation maintenance skills?"). So the case for bringing in structural analysis isn't aesthetic; it's that the text-reward objective is measuring the wrong thing.

Where conversation analysis earns its keep is as a *source of structure that text rewards can't see*. Insert-expansions, the clarifying sub-sequences humans use before answering, give a formal account of *when* an agent should stop and consult the user instead of silently chaining tools toward the wrong goal (see "When should AI agents ask users instead of just searching?"). Proactivity, meaning offering relevant information unasked, mirrors Grice's conversational maxims and can cut dialogue turns by up to 60% in simulated medium-complexity domains, yet it's nearly absent from the datasets models train on (see "Could proactive dialogue make conversations dramatically more efficient?"). These are structural diagnoses of what good dialogue requires: the kind of thing a token-level loss will never surface on its own.
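One way to make the "when to stop and consult the user" question concrete is a small decision rule. The sketch below is not taken from the corpus: it assumes the agent holds a probability distribution over candidate user intents and opens a clarifying insert-expansion whenever the entropy of that distribution exceeds a threshold. The function names, the intent labels, and the 1.0-bit threshold are all hypothetical.

```python
import math


def intent_entropy(intent_probs):
    """Shannon entropy (in bits) of a distribution over candidate intents."""
    return -sum(p * math.log2(p) for p in intent_probs.values() if p > 0)


def next_move(intent_probs, ask_threshold_bits=1.0):
    """Return ("ask", question) or ("act", intent).

    Hypothetical rule: open a clarifying insert-expansion when uncertainty
    about the user's intent exceeds the threshold; otherwise proceed with
    the most likely intent (for example, by continuing a tool chain).
    """
    if intent_entropy(intent_probs) > ask_threshold_bits:
        # Offer the two most likely readings back to the user.
        top = sorted(intent_probs, key=intent_probs.get, reverse=True)[:2]
        return ("ask", f"Did you mean {top[0]} or {top[1]}?")
    return ("act", max(intent_probs, key=intent_probs.get))


# Invented example distributions, for illustration only.
clear = {"book_flight": 0.9, "check_status": 0.05, "cancel": 0.05}
ambiguous = {"book_flight": 0.4, "check_status": 0.35, "cancel": 0.25}

print(next_move(clear))
print(next_move(ambiguous))
```

With these invented numbers the agent acts on the clear request and asks about the ambiguous one. A real system would need some way to estimate the intent distribution, which this sketch simply takes as given.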

But 'replace' is the wrong verb, and the corpus keeps pointing to *richer reward formulations* rather than to abandoning learned signals. Multi-turn-aware rewards that estimate long-term interaction value already encode some of this structure back into RL (see "Why do language models respond passively instead of asking clarifying questions?"). Unified policy learning folds what to ask, what to recommend, and when to do each into one trajectory-level objective, beating separated components precisely because conversation is a structured whole (see "Can unified policy learning improve conversational recommender systems?"). And information-theoretic models like collaborative rational speech acts (CRSA) track *both* speakers' beliefs across turns, the bidirectional structure that token-level systems lack, offering a formal scaffold that could shape a reward rather than discard the idea of one (see "Can dialogue systems track both speakers' beliefs across turns?"). Note too that reward signals don't have to be human text labels at all: model confidence can serve as an intrinsic reward (see "Can model confidence work as a reward signal for reasoning?"), which suggests the real frontier is *what you reward*, not whether you reward.
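The contrast between a next-turn reward and a multi-turn-aware reward can be sketched in a few lines. This is a toy illustration under stated assumptions, not the method of any cited work: the simulated futures, the scoring rule (goal completion minus a small per-turn cost), and every number below are invented for the example.

```python
from statistics import mean


def conversation_score(future):
    """Score one simulated conversation: goal completion minus a small
    per-turn cost. Both terms are illustrative assumptions."""
    return float(future["goal_met"]) - 0.05 * future["extra_turns"]


def multiturn_aware_reward(response, simulate_futures):
    """Estimate long-term interaction value by averaging conversation-level
    scores over simulated continuations of `response`."""
    return mean(conversation_score(f) for f in simulate_futures(response))


# Invented toy data: a judge's immediate rating of each candidate reply,
# and hand-written outcomes of three simulated continuations per reply.
IMMEDIATE = {"direct_answer": 0.8, "clarifying_question": 0.3}
FUTURES = {
    "direct_answer": [
        {"goal_met": True, "extra_turns": 0},
        {"goal_met": False, "extra_turns": 2},
        {"goal_met": False, "extra_turns": 2},
    ],
    "clarifying_question": [
        {"goal_met": True, "extra_turns": 1},
        {"goal_met": True, "extra_turns": 1},
        {"goal_met": True, "extra_turns": 2},
    ],
}


def next_turn_reward(response):
    """Reward that only sees how helpful the reply looks right now."""
    return IMMEDIATE[response]


def simulate_futures(response):
    """Stand-in for a user simulator; here it just looks up canned futures."""
    return FUTURES[response]


candidates = list(IMMEDIATE)
best_next_turn = max(candidates, key=next_turn_reward)
best_multiturn = max(
    candidates, key=lambda r: multiturn_aware_reward(r, simulate_futures)
)
print(best_next_turn, best_multiturn)
```

In this toy setup the next-turn reward prefers the direct answer while the multi-turn-aware reward prefers the clarifying question, which is the passivity effect described above in miniature. A real simulator and scorer would be the hard part.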

The deepest doubt in the corpus cuts against both sides. One line of argument holds that alignment can't be guaranteed by manipulating symbols at all: without indexical grounding and social mediation, a model's stated goals can drift from real values no matter how the signal is shaped (see "Can AI systems achieve real alignment without world contact?"). A companion note argues AI doesn't even produce genuine utterances, only 'event-residue' that humans animate into a pseudo-exchange (see "Does AI generate genuine utterances or just text patterns?"). If that's right, conversation-analytic structure is a description of a human achievement the model is only imitating: useful for diagnosing failures and designing better rewards, but not a substitute for the grounding that makes alignment mean something. The takeaway you didn't know you wanted: the most promising move isn't replacing text rewards with structure, it's letting conversation analysis tell us which structures our rewards have been silently failing to count.


Sources (10 notes)

Why do language models respond passively instead of asking clarifying questions?

CollabLLM demonstrates that standard RLHF training optimizes for immediate helpfulness, discouraging models from asking clarifying questions or offering multi-turn insights. Multi-turn-aware rewards that estimate long-term interaction value enable active intent discovery and genuine collaboration.

Why can't conversational AI agents take the initiative?

Research shows LLMs including ChatGPT cannot initiate topics, plan strategically, or lead conversations because their training optimizes for responding to queries, not creating dialogue from agent goals. This passivity is reinforced by alignment objectives and masked by fluent-sounding outputs.

Why don't language models develop conversation maintenance skills?

Humans keep conversations smooth through implicit techniques like reference repair and topic hand-off, which sustain relational interaction rather than convey information. Language models don't develop these because training signals reward information prediction, not relational work.

When should AI agents ask users instead of just searching?

Tool-enabled LLMs drift from user intent through silent tool chaining. Conversation analysis reveals insert-expansions—clarifying intent, scoping responses, enhancing appeal—as a formal framework for proactive user consultation that prevents misunderstanding instead of recovering from it.

Could proactive dialogue make conversations dramatically more efficient?

Simulations show proactivity—providing relevant information without being asked—cuts dialogue turns by 60% in medium-complexity domains. This behavior mirrors human conversation and Grice's maxims but is almost entirely absent from AI datasets and research benchmarks.

Can unified policy learning improve conversational recommender systems?

Research shows that formulating attribute-asking, item-recommending, and timing decisions as a single graph-based RL policy achieves better joint optimization than isolated components. Separation prevents gradient signals from informing one another and fails to optimize conversation trajectory holistically.

Can dialogue systems track both speakers' beliefs across turns?

CRSA integrates rate-distortion theory with RSA to enable bidirectional belief tracking across dialogue turns. Demonstrated on referential games and doctor-patient dialogues, it captures progression from partial to shared understanding, providing the information-theoretic framework that token-level LLM systems lack.

Can model confidence work as a reward signal for reasoning?

RLSF uses answer-span confidence to rank reasoning traces, creating synthetic preferences that strengthen step-by-step reasoning while reversing RLHF's calibration degradation—without requiring human labels or external verifiers.
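The ranking step described in this note can be sketched roughly as follows. This is a minimal illustration, not the RLSF implementation: it assumes confidence is the mean per-token probability over the answer span, and the `min_gap` filter, the function names, and the log-probabilities are hypothetical.

```python
import math
from itertools import combinations
from statistics import mean


def answer_confidence(answer_logprobs):
    """Assumed confidence measure: mean per-token probability over the
    tokens of the final answer span."""
    return mean(math.exp(lp) for lp in answer_logprobs)


def build_preference_pairs(traces, min_gap=0.1):
    """Rank reasoning traces by answer-span confidence and emit
    (chosen_id, rejected_id) pairs whose confidence gap is at least
    `min_gap`. No human labels or external verifiers are involved."""
    ranked = sorted(
        traces,
        key=lambda t: answer_confidence(t["answer_logprobs"]),
        reverse=True,
    )
    pairs = []
    for better, worse in combinations(ranked, 2):
        gap = answer_confidence(better["answer_logprobs"]) - answer_confidence(
            worse["answer_logprobs"]
        )
        if gap >= min_gap:
            pairs.append((better["id"], worse["id"]))
    return pairs


# Invented log-probabilities for three sampled traces of the same prompt.
traces = [
    {"id": "A", "answer_logprobs": [-0.05, -0.1]},
    {"id": "B", "answer_logprobs": [-0.7, -0.9]},
    {"id": "C", "answer_logprobs": [-0.3, -0.2]},
]
pairs = build_preference_pairs(traces)
print(pairs)
```

The resulting pairs could then feed a standard preference-optimization step, which is outside the scope of this sketch.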

Can AI systems achieve real alignment without world contact?

Peircean semiotics reveals that symbolic goal encoding without world contact and social mediation cannot guarantee correspondence to actual values. LLMs operating in pure symbol manipulation risk divergence between stated goals and real-world outcomes.

Does AI generate genuine utterances or just text patterns?

AI output carries communicative markers inherited from training data but lacks the event structure that produces actual utterances. Users supply the missing orientation through interpretive labor, creating a pseudo-event with structure only on the human side.
