Conversational AI Systems · Psychology and Social Cognition

Why do language models respond passively instead of asking clarifying questions?

Explores whether the reward signals used to train language models might actively discourage them from seeking clarification or taking initiative in conversations, and what alternative training approaches might enable more collaborative dialogue.

Note · 2026-02-22 · sourced from Conversation Agents
Why do AI agents fail to take initiative? What kind of thing is an LLM really? How should researchers navigate LLM reasoning research?

CollabLLM makes the training mechanism behind passive responding explicit: "Large Language Models are typically trained with next-turn rewards, limiting their ability to optimize for long-term interaction." The result: models respond passively to ambiguous or open-ended user requests, failing to help users reach their ultimate intents and leading to inefficient conversations.

The fix is multi-turn-aware rewards — rewards that estimate the long-term contribution of a response to the overall interaction quality, not just its immediate helpfulness. By reinforcement fine-tuning with these rewards, CollabLLM enables models to actively uncover user intent rather than respond passively.
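A minimal sketch of the contrast, assuming the user simulator, judges, and policy are plain callables; `simulate_user`, `judge_turn`, `judge_conversation`, and `policy` are hypothetical placeholders for illustration, not CollabLLM's actual API. The next-turn reward scores only the immediate response, while the multi-turn-aware reward rolls out simulated future turns and scores the whole interaction.

```python
# Sketch only (not CollabLLM's code): next-turn vs. multi-turn-aware rewards.
# `simulate_user`, `judge_turn`, `judge_conversation`, and `policy` are
# hypothetical placeholders for a user simulator and LLM-based judges.
from statistics import mean


def next_turn_reward(conversation, response, judge_turn):
    """Score only the immediate helpfulness of `response` in context."""
    return judge_turn(conversation + [("assistant", response)])


def multi_turn_aware_reward(conversation, response, policy, simulate_user,
                            judge_conversation, horizon=4, n_rollouts=3):
    """Estimate the long-term contribution of `response` by rolling out
    simulated future turns and judging overall interaction quality."""
    scores = []
    for _ in range(n_rollouts):
        dialogue = conversation + [("assistant", response)]
        for _ in range(horizon):
            dialogue.append(("user", simulate_user(dialogue)))     # simulated user reply
            dialogue.append(("assistant", policy(dialogue)))       # model's next turn
        scores.append(judge_conversation(dialogue))                # whole-dialogue quality
    return mean(scores)
```

Under this signal, a clarifying question can score higher than an immediate guess whenever it steers the simulated rollout toward the user's actual intent.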

This is a direct mechanism explanation for the alignment tax. As established in "Does preference optimization harm conversational understanding?", RLHF training degrades multi-turn reliability. CollabLLM identifies the specific training signal responsible, next-turn rewards, and proposes the specific fix: rewards that account for multi-turn consequences.

The connection to proactivity is also direct. As argued in "Why can't conversational AI agents take the initiative?", the passivity is not just a missing feature: it is actively trained in by next-turn reward optimization. You cannot add proactivity on top of a training signal that rewards only reactive helpfulness.

The CollabLLM framework evaluates on three challenging tasks including document creation — contexts where multi-turn collaboration is essential and single-turn helpfulness is insufficient. This grounds the claim in practical interaction scenarios rather than abstract capability measurement.

The Intent Mismatch paper directly supports this causal mechanism: it argues premature assumptions in multi-turn conversation are rational under RLHF helpfulness training. Models construct plausible task formulations for "typical" users and produce provisional answers because the training objective penalizes evasion and rewards helpfulness. The proposed fix — a Mediator-Assistant architecture that decouples intent understanding from task execution — complements CollabLLM's reward-signal approach with an architectural intervention. Both identify next-turn optimization as the root cause; they differ on whether the fix is changing the reward (CollabLLM) or restructuring the system (Intent Mismatch).
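A minimal sketch of that decoupling, assuming the mediator and assistant are plain callables; `IntentEstimate`, `mediator_model`, `assistant_model`, and the confidence threshold are hypothetical names for illustration, not the Intent Mismatch paper's implementation. The mediator owns intent understanding and only hands a clarified task to the assistant once it is confident enough.

```python
# Sketch only: decoupling intent understanding (mediator) from task execution
# (assistant). Names and threshold are illustrative assumptions.
from dataclasses import dataclass


@dataclass
class IntentEstimate:
    task: str            # best current formulation of the user's task
    confidence: float    # mediator's confidence that the task is well specified
    open_question: str   # clarifying question to ask if confidence is low


def mediator_assistant_turn(conversation, mediator_model, assistant_model,
                            threshold=0.8):
    """Route one user turn: clarify intent first, execute only once it is clear."""
    estimate: IntentEstimate = mediator_model(conversation)
    if estimate.confidence < threshold:
        # Intent still ambiguous: the mediator asks instead of guessing.
        return estimate.open_question
    # Intent is clear enough: the assistant executes the clarified task.
    return assistant_model(estimate.task)
```

The design choice is that the clarify-or-execute decision lives outside the assistant, so it no longer depends on a reward signal that penalizes withholding an answer.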


Source: Conversation Agents, Conversation Topics Dialog

Related concepts in this collection

next-turn reward optimization limits multi-turn collaboration — multi-turn-aware rewards enable models to actively uncover intent rather than passively respond