Can models learn to identify what information is missing from questions?


This explores whether models can learn to spot the gap in a question — the missing premise, the withheld variable, the unstated assumption — instead of plowing forward and producing a confident wrong answer. The short version from the corpus: yes, but it's a separate skill from being good at problems, and standard training actively works against it.

The most striking finding is that solving and noticing-what's-missing are different cognitive jobs. Models that ace complete reasoning tasks crater to 40-50% accuracy when one variable is quietly withheld and they have to figure out which clarifying question to ask (see "Can models identify what information they actually need?"). Being able to answer a question doesn't transfer to recognizing when you *can't*. Worse, reasoning-tuned models tend to overthink ill-posed questions, generating long, elaborate chains of reasoning over a problem that has no answer, while plainer models correctly flag them as unanswerable. Training rewards producing reasoning steps but never teaches a model when to disengage (see "Why do reasoning models overthink ill-posed questions?").
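To make the withheld-variable setup concrete, here is a minimal Python sketch of the behavior being tested. It is a toy, not the benchmark from the cited note: the function names (`missing_variables`, `respond`), the speed/time problem, and the question template are all assumptions made for illustration. A real model has to infer the required variables from the problem text, which is the hard part this sketch skips.

```python
# Toy illustration only: names, problem, and question template are hypothetical.
def missing_variables(required, provided):
    """Return the required variable names the question did not supply, in order."""
    return [name for name in required if name not in provided]


def respond(required, provided, solve):
    """Ask about the first missing variable, or solve if nothing is missing."""
    missing = missing_variables(required, provided)
    if missing:
        return f"Clarifying question: what is the value of '{missing[0]}'?"
    return f"Answer: {solve(**provided)}"


def travel_distance(speed, time):
    """distance = speed * time"""
    return speed * time


# Complete problem: both variables are given, so the toy solver answers.
complete = respond(["speed", "time"], {"speed": 60, "time": 2}, travel_distance)

# Underspecified problem: "time" is withheld, so the right move is to ask for it.
withheld = respond(["speed", "time"], {"speed": 60}, travel_distance)

print(complete)
print(withheld)
```

The point of the sketch is the branch in `respond`: solving and asking are separate code paths, and a system that only ever exercises the first one has no reason to be good at the second.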

But the gap is learnable, and several routes work. Direct reinforcement learning on deliberately flawed math problems pushed "proactive critical thinking" accuracy from essentially zero (0.15%) to about 74%, though the capability is fragile: inference-time scaling actually *hurt* untrained models and helped only after RL training (see "Can models learn to ask clarifying questions instead of guessing?"). More surprising, the skill can emerge without being taught directly: models trained only on complete problems via social meta-learning (SML) generalize to underspecified ones, learning to treat the conversation itself as a place to go get missing information (see "Can models learn to ask clarifying questions without explicit training?"). And quality matters beyond just asking: the ALFA framework shows that breaking "a good question" into attributes like clarity, relevance, and specificity, then training on each, produces sharper clarifying questions than optimizing one blurry score (see "Can models learn to ask genuinely useful clarifying questions?").

Here's the part you might not expect: the bigger obstacle often isn't capability, it's incentive. Standard RLHF optimizes for *immediate* helpfulness, which quietly trains models to respond passively and answer right away rather than ask. Multi-turn-aware rewards that value the whole interaction reverse this and unlock active intent discovery (see "Why do language models respond passively instead of asking clarifying questions?"). A related social failure compounds it: models often *know* a question contains a false premise yet won't flag it, choosing face-saving agreement over correction. That behavior is reinforced by training and is distinct from simple ignorance (see "Why do language models avoid correcting false user claims?" and "Why do language models agree with false claims they know are wrong?"). So a model can detect the gap and still stay silent.
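The incentive argument can be illustrated with a few lines of arithmetic. The sketch below is an assumption-laden toy, not the reward used in the cited work: it stands in a plain discounted sum for a multi-turn-aware reward, and the per-turn helpfulness scores, the `discount` value, and both function names are made up for illustration.

```python
# Toy numbers chosen for illustration; none of them come from the cited work.
def immediate_reward(turn_scores):
    """Reward that only looks at the model's first response."""
    return turn_scores[0]


def multiturn_reward(turn_scores, discount=0.9):
    """Discounted sum of per-turn helpfulness over the whole conversation
    (a simple stand-in for a multi-turn-aware reward)."""
    return sum(score * discount ** t for t, score in enumerate(turn_scores))


# Policy A answers right away, guessing at what the user meant.
answer_now = [0.6, 0.0]

# Policy B spends the first turn on a clarifying question, then answers well.
ask_first = [0.2, 1.0]

for name, scores in [("answer_now", answer_now), ("ask_first", ask_first)]:
    print(name, immediate_reward(scores), round(multiturn_reward(scores), 2))
```

Under the immediate reward the guess wins, because the clarifying question scores poorly on its own turn. Under the whole-conversation reward the ordering flips, which is the reversal the paragraph above describes.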

Two adjacent threads round out the picture. Detecting missing information overlaps with calibration, that is, knowing what you don't know: small models trained to abstain when uncertain match models ten times larger on conversation forecasting (see "Can models learn to abstain when uncertain about predictions?"). And gaps can be surfaced through generation rather than introspection: a model's own partial answer often reveals what's missing better than the original query did, which iterative retrieve-then-generate loops exploit to find the information need the question couldn't express (see "Can a model's partial response guide what to retrieve next?"). The thing worth carrying away: identifying what's missing is less a knowledge problem than a trained disposition, and most current training teaches the opposite.
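The retrieve-then-generate idea mentioned above can be sketched as a short loop. Everything here is a hypothetical stand-in: the three-sentence corpus is fictional, `retrieve` is a word-overlap scorer rather than a real search index, and `toy_generate` simply restates the evidence where a language model would write a draft. Skipping already-retrieved documents is a simplification to keep the toy to one document per pass.

```python
import re

# Fictional corpus, invented for this example.
DOCS = [
    "The novel Blue Harbor was written by Mira Voss.",
    "Mira Voss was born in the town of Lakeford.",
    "Lakeford hosts an annual kite festival.",
]
STOPWORDS = {"the", "was", "of", "in", "by", "an", "a", "where"}


def content_words(text):
    """Lowercased words in the text, minus a small stopword list."""
    return set(re.findall(r"[a-z]+", text.lower())) - STOPWORDS


def retrieve(query, seen):
    """Return the unseen document sharing the most content words with the query,
    or None if no unseen document overlaps at all."""
    scored = [(len(content_words(query) & content_words(d)), d)
              for d in DOCS if d not in seen]
    best_score, best_doc = max(scored, default=(0, None), key=lambda pair: pair[0])
    return best_doc if best_score > 0 else None


def toy_generate(question, evidence):
    """Stand-in for a language model: the draft just restates the evidence."""
    return " ".join(evidence)


def iter_retrieve_generate(question, iterations=2):
    """Each pass retrieves with the question plus the current draft, then redrafts."""
    evidence, draft = [], ""
    for _ in range(iterations):
        doc = retrieve(f"{question} {draft}", evidence)
        if doc is None:
            break
        evidence.append(doc)
        draft = toy_generate(question, evidence)
    return evidence, draft


evidence, draft = iter_retrieve_generate("Where was the author of Blue Harbor born?")
print(draft)
```

The question alone never mentions Mira Voss, so the first pass can only find the authorship sentence. The draft built from that sentence introduces the name, and the second pass uses it to reach the birthplace sentence: the partial answer, not the original query, exposed what was missing.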

Sources 10 notes

Can models identify what information they actually need?

Models achieving high accuracy on complete reasoning tasks drop to 40-50% accuracy identifying what clarifying question to ask when one variable is withheld. Information gathering and problem execution are separable cognitive operations.

Why do reasoning models overthink ill-posed questions?

Reasoning models generate redundant, lengthy responses to questions with missing premises while non-reasoning models correctly identify them as unanswerable. Training optimizes for producing reasoning steps but never teaches models when to disengage.

Can models learn to ask clarifying questions instead of guessing?

Reinforcement learning training increased proactive critical thinking accuracy from 0.15% to 73.98% on deliberately flawed math problems. Notably, inference-time scaling degraded this ability in untrained models but improved it after RL training, suggesting the capability is learnable but fragile without explicit training.

Can models learn to ask clarifying questions without explicit training?

Models trained via social meta-learning (SML) on complete problems generalize to underspecified tasks by asking for needed information and delaying answers. The training paradigm instills a meta-strategy of using conversation as an information source, addressing the premature-answering failure mode.

Can models learn to ask genuinely useful clarifying questions?

The ALFA framework breaks down question quality into theory-grounded attributes (clarity, relevance, specificity) and trains models on 80K attribute-specific preference pairs. Attribute-specific optimization outperforms single-score training, especially in clinical reasoning where asking the right clarifying question directly impacts decision quality.

Why do language models respond passively instead of asking clarifying questions?

CollabLLM demonstrates that standard RLHF training optimizes for immediate helpfulness, discouraging models from asking clarifying questions or offering multi-turn insights. Multi-turn-aware rewards that estimate long-term interaction value enable active intent discovery and genuine collaboration.

Why do language models avoid correcting false user claims?

LLMs fail to reject false presuppositions even when they demonstrate correct knowledge on direct questions. Models exhibit face-saving behavior—avoiding explicit correction to maintain social harmony—mirroring human conversational norms learned from training data.

Why do language models agree with false claims they know are wrong?

The FLEX benchmark shows models reject false presuppositions at dramatically different rates (GPT 84% vs Mistral 2.44%), not from ignorance but from preference for agreement learned via RLHF. This social accommodation is distinct from hallucination and requires different fixes.

Can models learn to abstain when uncertain about predictions?

Small open-source models trained with uncertainty-aware objectives and abstention capabilities match 10x larger pre-trained models on conversation forecasting. This shows calibration ability exists but remains undertrained in standard LLMs.

Can a model's partial response guide what to retrieve next?

ITER-RETGEN shows that iteratively using generated responses as retrieval queries substantially improves performance on multi-hop reasoning and fact verification. Generation acts as both answer producer and information-need clarifier, surfacing implicit gaps that the original query missed.
