Why do specific clarifying questions outperform generic requests for clarity?

This explores why a clarifying question that targets a concrete information gap ("What kind of monitor?") beats one that throws the work back to the user ("What are you trying to do?") — and what the corpus says about the mechanics behind that gap.

This explores why a clarifying question that names a specific missing detail outperforms a generic plea for more context, and the corpus converges on a single idea from several directions: a good clarifying question does the cognitive work *for* the user instead of handing it back to them. The most direct evidence is that facet-specific questions consistently beat need-rephrasing ones on user satisfaction, and the reason given is foresight — people engage when they can see how their answer will improve the result Which clarifying questions actually improve user satisfaction?. "What type of monitor?" tells you the gap is closeable and worth closing; "What are you trying to do?" asks you to re-do the work of specifying the problem you thought you'd already stated.

Underneath that satisfaction gap sits a mechanism worth seeing: generic requests are often a symptom of the same failure that produces generic *answers*. When a model has too little scaffolding, it falls back on blended training-data priors — a kind of context collapse — and a vague "tell me more" is the question-shaped version of that same averaging-out Why do large language models produce generic responses to vague queries?. A specific question, by contrast, is evidence the model has already localized *where* its uncertainty lives. That's exactly what the information-gain approaches formalize: simulate the possible answers to a candidate question, score how much each would shrink the uncertainty, and ask the one that cuts deepest How can models select the most informative question to ask?. Specificity isn't a stylistic choice there — it's the output of picking the highest-value probe rather than a catch-all.

The corpus also shows that "good question" isn't one quality but several, which is why generic requests underperform — they're under-optimized on every axis at once. The ALFA framework decomposes question quality into attributes like clarity, relevance, and specificity, and training on those attributes separately beats training on a single satisfaction score, most sharply in clinical reasoning where the right specific question changes the diagnosis Can models learn to ask genuinely useful clarifying questions?. A vague question scores low on relevance and specificity simultaneously; a targeted one is what you get when you optimize those independently.

What's less obvious — and the part you might not know you wanted — is that the harder problem isn't phrasing the question but knowing *that* one is needed at all. Several notes show models default to answering rather than asking: they overthink ill-posed problems with missing premises instead of flagging them as unanswerable Why do reasoning models overthink ill-posed questions?, and proactive-critical-thinking training lifts the rate of catching deliberately flawed problems from near-zero to ~74% Can models learn to ask clarifying questions instead of guessing?. Specificity, in other words, is downstream of detection: a model that has actually identified the precise missing fact can ask about it precisely, while one that's merely uneasy can only gesture. Social meta-learning gets at the same thing from the other side — train on complete problems and models still generalize to *ask for the specific missing piece* on underspecified ones, because they've learned to treat conversation as an information source rather than a prompt to guess Can models learn to ask clarifying questions without explicit training?.

One last lateral wrinkle: most real clarification doesn't even look like a question. Mapping clarification onto Clark's levels of communication shows that the majority of human repair is declarative ("I meant the 27-inch one"), not interrogative — which means systems that detect clarification by question-syntax alone are blind to most of it Why do clarification requests look different at each communication level?. The deeper lesson across all of this is the same: effective clarification is targeted repair of a located gap, and "can you be more specific?" fails precisely because it has located nothing.

Sources 8 notes

Which clarifying questions actually improve user satisfaction?

Clarifying questions that target concrete information gaps ("What type of monitor?") consistently beat those that ask users to rephrase their needs ("What are you trying to do?"). Users engage most when they can foresee how answering improves results.

Why do large language models produce generic responses to vague queries?

Unlike social-media context collapse, which flattens multiple audiences, LLM collapse occurs when users provide insufficient contextual scaffolding and models default to blended training-data priors. This distinction suggests remedies should focus on query verification and user-driven context specification rather than platform controls.

How can models select the most informative question to ask?

UoT combines uncertainty-aware scenario simulation with information-gain scoring and reward propagation to identify questions whose possible answers maximally reduce diagnostic uncertainty—providing a principled mechanism for specific, high-value clarification rather than generic prompts.

Can models learn to ask genuinely useful clarifying questions?

The ALFA framework breaks down question quality into theory-grounded attributes (clarity, relevance, specificity) and trains models on 80K attribute-specific preference pairs. Attribute-specific optimization outperforms single-score training, especially in clinical reasoning where asking the right clarifying question directly impacts decision quality.

Why do reasoning models overthink ill-posed questions?

Reasoning models generate redundant, lengthy responses to questions with missing premises while non-reasoning models correctly identify them as unanswerable. Training optimizes for producing reasoning steps but never teaches models when to disengage.

Can models learn to ask clarifying questions instead of guessing?

Reinforcement learning training increased proactive critical thinking accuracy from 0.15% to 73.98% on deliberately flawed math problems. Notably, inference-time scaling degraded this ability in untrained models but improved it after RL training, suggesting the capability is learnable but fragile without explicit training.

Can models learn to ask clarifying questions without explicit training?

Models trained via SML on complete problems generalize to underspecified tasks by asking for needed information and delaying answers. The training paradigm instills a meta-strategy of using conversation as an information source, addressing the premature-answering failure mode.

Why do clarification requests look different at each communication level?

Research maps clarification mechanisms to four levels of communication—attention, signal, meaning, action—each grounded in a different modality (socioperception, hearing, vision, kinesthetics). Most clarifications use declarative form, not questions, making them invisible to systems that detect by syntax alone.

Why do specific clarifying questions outperform generic requests for clarity?

Sources 8 notes

Next inquiring lines