Design & LLM Interaction · Psychology and Social Cognition · LLM Reasoning and Architecture

Can models learn to ask clarifying questions instead of guessing?

Exploring whether large language models can be trained to detect incomplete queries and actively request missing information rather than hallucinating answers or refusing to respond. This matters because conversational agents today remain passive, responding only when prompted.

Note · 2026-02-22 · sourced from Conversation Agents
Related questions: Why do AI agents fail to take initiative? · How should we allocate compute budget at inference time? · How should researchers navigate LLM reasoning research?

Current LLMs face three failure modes when receiving flawed or incomplete queries: they hallucinate an answer, they refuse to respond, or they provide a generic "I need more information" deflection. None of these is productive. The proactive critical thinking paradigm introduces a fourth option: identify specifically what is missing and generate a targeted question to request it.
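To make that fourth option concrete, here is a minimal sketch of a "detect the gap, then ask or answer" loop. The helper `call_llm` and the detection prompt wording are my own assumptions for illustration, not anything specified in the source.

```python
# Minimal sketch of the "fourth option": check a query for missing
# information before answering. `call_llm` and DETECT_PROMPT are
# hypothetical stand-ins, not part of the original work.

DETECT_PROMPT = """You are given a user query. If any information required
to answer it is missing, name the missing piece exactly and ask one
targeted question to obtain it. Otherwise reply with the single word
ANSWERABLE.

Query: {query}"""

def respond_or_ask(query: str, call_llm) -> str:
    """Return either a targeted clarifying question or a final answer."""
    check = call_llm(DETECT_PROMPT.format(query=query))
    if check.strip() != "ANSWERABLE":
        # The model identified a specific gap: surface its question verbatim.
        return check
    # No gap detected: answer the query directly.
    return call_llm(f"Answer the following query:\n{query}")
```

The key design choice is that the gap check runs before any answer is attempted, so a hallucinated guess never has the chance to be produced.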

The GSM-MC benchmark tests this by deliberately removing key variables from math problems. The results are dramatic: vanilla models score near zero at noticing the omission and asking for the missing value.

The near-zero baseline reveals something important: despite extensive post-training that makes these models excellent at reasoning, they have almost no ability to detect when a problem is ill-posed and actively seek the missing piece. This is a specific capability gap, not a general reasoning limitation.
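To show what such an evaluation item could look like, here is a sketch in the spirit of GSM-MC: delete one key variable from a word problem, then credit the model only if it asks for that specific quantity. The field names and the judging heuristic are illustrative assumptions, not the benchmark's actual schema or scorer.

```python
# Sketch of a GSM-MC-style item: a word problem with one key variable
# removed, plus a crude check for whether the model asks for it.
from dataclasses import dataclass

@dataclass
class IncompleteItem:
    question: str          # problem with one variable removed
    missing_variable: str  # the deleted quantity, e.g. "price per apple"

# Original (complete) problem, kept here only to show what was removed.
original = ("Tom buys 6 apples at $2 each and pays with a $20 bill. "
            "How much change does he get?")

item = IncompleteItem(
    question=("Tom buys 6 apples and pays with a $20 bill. "
              "How much change does he get?"),
    missing_variable="price per apple",
)

def is_proactive(model_reply: str, item: IncompleteItem) -> bool:
    """Credit the reply only if it asks a question that names the missing
    quantity, rather than guessing an answer or refusing generically."""
    reply = model_reply.lower()
    return "?" in reply and item.missing_variable.split()[0] in reply
```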

A striking secondary finding: inference-time scaling (activating "thinking mode") actually degrades proactive critical thinking in vanilla models. The extended thinking induces "counterproductive self-doubt rather than useful analysis, leading to a clear drop in performance." But after RL training, thinking mode becomes beneficial — the same mechanism that hurts untrained models helps trained ones.

This finding matters beyond math: a patient omitting critical symptoms, a user providing incomplete specifications, or a student asking an ambiguous question all require the agent to identify what is missing and ask, not just refuse or guess. Viewed through the linked note "Why can't conversational AI agents take the initiative?", proactive critical thinking is a concrete, trainable instantiation of the broader proactivity gap.

ProCoT (Proactive Chain-of-Thought) extends the paradigm from individual queries to multi-turn goal planning: rather than just detecting missing information in a single exchange, models generate explicit reasoning chains about conversation goals and plan proactive interventions across turns. This bridges proactive critical thinking (reactive: "this query is incomplete") with proactive dialogue (strategic: "given the user's goal, I should ask about X before they realize they need it").
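A rough sketch of what a ProCoT-style turn could look like in practice follows. The prompt wording and the act labels (ASK, ANSWER, SUGGEST) are my own illustrative choices under the paradigm described above, not the paper's exact format.

```python
# Sketch of a ProCoT-style step: reason about the dialogue goal first,
# then commit to one proactive act for the next turn. Prompt text and
# act labels are assumptions for illustration.

PROCOT_PROMPT = """Conversation so far:
{history}

First, reason step by step about what the user is ultimately trying to
accomplish and what information is still missing.

Then choose exactly one act for the next turn:
- ASK(<targeted clarifying question>)
- ANSWER(<direct response>)
- SUGGEST(<proactive next step the user has not requested yet>)

Reasoning:"""

def next_turn(history: list[str], call_llm) -> str:
    """Produce the model's reasoning chain plus its chosen act."""
    return call_llm(PROCOT_PROMPT.format(history="\n".join(history)))
```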

The ALFA framework for clinical reasoning extends this by showing that question quality is multidimensional — a question can be clear but irrelevant, or relevant but ambiguous. ALFA decomposes "good question" into theory-grounded attributes (clarity, relevance, specificity) and trains against each via 80K attribute-specific preference pairs. This addresses a gap: proactive critical thinking shows models can learn to ask, but ALFA shows they need attribute-specific training to ask well. Additionally, research on clarifying question design shows that specific-facet questions ("What type of monitor?") consistently outperform need-rephrasing questions ("Can you be more specific?") for user satisfaction — the form of the question matters as much as the decision to ask.
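To illustrate what attribute-specific preference data might look like, here is a sketch in the spirit of ALFA: the same clinical context paired with a chosen and a rejected question that differ along exactly one attribute. The schema and the example pair are assumptions, not taken from the ALFA release.

```python
# Sketch of attribute-specific preference pairs: one pool per attribute,
# so preference optimization (e.g. DPO) can target clarity, relevance,
# and specificity separately. Schema is illustrative only.
from dataclasses import dataclass

ATTRIBUTES = ("clarity", "relevance", "specificity")

@dataclass
class AttributePreferencePair:
    context: str    # e.g. a partial patient description
    attribute: str  # the single attribute this pair contrasts
    chosen: str     # question that is strong on that attribute
    rejected: str   # question that is weak on that attribute

pair = AttributePreferencePair(
    context="55-year-old reports chest tightness; duration not given.",
    attribute="specificity",
    chosen=("How long has the chest tightness lasted, and does it "
            "spread to your arm or jaw?"),
    rejected="Can you tell me more about how you feel?",
)
```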


Source: Conversation Agents, Conversation Topics Dialog, Conversation Architecture Structure

Original note title: proactive critical thinking enables models to identify missing information and actively request clarification rather than passively refusing or hallucinating answers.