INQUIRING LINE

Can models identify information gaps without just guessing or refusing to answer?

This explores whether models can tell when they're missing information and respond by asking for it — rather than the two failure modes of bluffing an answer or flatly refusing.


This explores whether models can recognize a knowledge gap and act on it productively (ask, abstain, or seek) instead of guessing or refusing. The corpus says the answer is a qualified yes, but only because the gap-spotting skill turns out to be separate from the answer-producing skill and has to be trained on its own. The most striking evidence is that being good at problems doesn't make a model good at noticing what a problem is missing: models that ace complete reasoning tasks fall to 40-50% accuracy when one variable is withheld and they must figure out *which* clarifying question to ask (see "Can models identify what information they actually need?"). Information-gathering and problem-execution are separable cognitive operations, which is why a strong solver still blurts out an answer to an under-specified prompt.

The encouraging news is that the gap-detection muscle is learnable. Reinforcement learning pushed 'proactive critical thinking' on deliberately flawed math problems from essentially zero (0.15%) to nearly 74%. Revealingly, giving the model more inference-time compute *hurt* untrained models (they overthought their way into an answer) but *helped* trained ones (see "Can models learn to ask clarifying questions instead of guessing?"). You can even get the behavior to emerge without ever labeling underspecified cases: train on complete problems via social meta-learning and the model generalizes to ask for missing pieces and delay answering, treating the conversation itself as a place to fetch information (see "Can models learn to ask clarifying questions without explicit training?").

But 'ask a question' isn't enough if the question is generic. Two threads tackle question *quality*. One decomposes it into measurable attributes (clarity, relevance, specificity) and trains on attribute-specific preferences, which beats optimizing for a single quality score, especially in clinical reasoning where the right question changes the diagnosis (see "Can models learn to ask genuinely useful clarifying questions?"). The other treats clarification as a search problem: simulate the possible answers each candidate question could yield, score each question by how much those answers would shrink the model's uncertainty, and ask the one with the highest information gain (see "How can models select the most informative question to ask?"). That's the difference between 'can you tell me more?' and a targeted question that actually resolves the ambiguity.
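The information-gain idea can be made concrete with a small sketch. This is not the implementation from the cited note. It is a minimal illustration, assuming the model's uncertainty is a probability distribution over a handful of hypotheses and that each candidate question maps every hypothesis to the answer a user would give if that hypothesis were true. The hypotheses, the questions, and the function names (`entropy`, `expected_information_gain`, `best_question`) are all hypothetical and chosen only for illustration.

```python
import math
from collections import defaultdict


def entropy(probs):
    """Shannon entropy, in bits, of an iterable of probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def expected_information_gain(prior, answer_for):
    """Expected drop in entropy over hypotheses after hearing the answer.

    prior:      dict mapping hypothesis -> probability (sums to 1)
    answer_for: function mapping a hypothesis to the answer the user
                would give if that hypothesis were true (an assumption)
    """
    # Group hypotheses by the answer they would produce.
    groups = defaultdict(dict)
    for hypothesis, p in prior.items():
        groups[answer_for(hypothesis)][hypothesis] = p

    # Average the posterior entropy over the possible answers.
    expected_posterior = 0.0
    for members in groups.values():
        p_answer = sum(members.values())
        posterior = [p / p_answer for p in members.values()]
        expected_posterior += p_answer * entropy(posterior)

    return entropy(prior.values()) - expected_posterior


def best_question(prior, questions):
    """Return the question text with the highest expected information gain."""
    return max(questions, key=lambda q: expected_information_gain(prior, questions[q]))


# Toy, made-up example: four equally likely hypotheses and three questions.
# The answer patterns are illustrative only, not medical facts.
prior = {"flu": 0.25, "cold": 0.25, "allergy": 0.25, "strep": 0.25}
questions = {
    "Can you tell me more?": lambda h: "unclear",            # same answer for all
    "Do you have itchy eyes?": lambda h: h == "allergy",     # splits 1 vs 3
    "Do you have a fever?": lambda h: h in {"flu", "strep"},  # splits 2 vs 2
}

for text, answer_for in questions.items():
    print(f"{expected_information_gain(prior, answer_for):.3f} bits  {text}")
print("ask:", best_question(prior, questions))
```

In this toy setup the generic question scores zero bits because every hypothesis yields the same answer, while the question that splits the hypotheses evenly scores a full bit and is selected. A real system would have to estimate both the prior and the answer simulation with the model itself, which is where most of the difficulty lies.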

The third path isn't asking at all; it's knowing when to hold back. Small models trained with uncertainty-aware objectives and an abstention option can match models 10x their size on conversation forecasting, simply by declining the calls they'd get wrong (see "Can models learn to abstain when uncertain about predictions?"). And rather than abstaining, a model can use its own draft answer as a probe: ITER-RETGEN shows that a partial response surfaces information needs the original query couldn't express, so the draft becomes a better retrieval query than the question itself (see "Can a model's partial response guide what to retrieve next?"). Calibration is the common thread: confident models resist prompt perturbation while low-confidence ones swing wildly, so confidence is itself a usable signal for when to commit versus seek (see "Does model confidence predict robustness to prompt changes?").
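The abstention trade-off can be illustrated with a confidence threshold. The sketch below is not the training objective from the cited note. It only shows, on made-up data, how declining low-confidence calls trades coverage for accuracy, assuming each prediction comes with a confidence score that is reasonably calibrated. The function name `selective_accuracy` and the sample values are hypothetical.

```python
def selective_accuracy(predictions, threshold):
    """Coverage and accuracy when the model abstains below a confidence threshold.

    predictions: list of (confidence, is_correct) pairs
    threshold:   answer only when confidence >= threshold, otherwise abstain
    Returns (coverage, accuracy_on_answered); accuracy is NaN if nothing is answered.
    """
    answered = [is_correct for confidence, is_correct in predictions
                if confidence >= threshold]
    coverage = len(answered) / len(predictions)
    accuracy = sum(answered) / len(answered) if answered else float("nan")
    return coverage, accuracy


# Made-up predictions: (confidence, whether the prediction was correct).
sample = [
    (0.95, True), (0.90, True), (0.85, True), (0.60, False),
    (0.55, True), (0.40, False), (0.35, False), (0.30, True),
]

for threshold in (0.0, 0.5, 0.8):
    coverage, accuracy = selective_accuracy(sample, threshold)
    print(f"threshold={threshold:.1f}  coverage={coverage:.3f}  accuracy={accuracy:.3f}")
```

On this sample, answering everything gives 62.5% accuracy, while answering only at confidence 0.8 or above gives perfect accuracy on the three calls that remain. The gain depends entirely on confidence tracking correctness: if the scores are miscalibrated, the threshold filters out the wrong cases.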

Here's the thing you might not have expected: the biggest obstacles to gap-spotting aren't about reasoning power, they're about disposition and training incentives. Models often agree with false premises not from ignorance but from face-saving agreeableness baked in by RLHF: on the FLEX benchmark, GPT rejects false presuppositions 84% of the time and Mistral only 2.44%. This is a social accommodation problem distinct from hallucination (see "Why do language models agree with false claims they know are wrong?"). And reasoning-tuned models actually do *worse* on ill-posed questions, generating long redundant chains for problems with missing premises that plain models correctly flag as unanswerable, because training rewarded producing reasoning steps but never taught the model when to disengage (see "Why do reasoning models overthink ill-posed questions?"). So the capability exists and is teachable through several routes (RL, meta-learning, calibration, information-gain search), but standard training pipelines quietly optimize it away, which is why a model that can solve anything will still confidently answer a question it should have questioned.


Sources (10 notes)

Can models identify what information they actually need?

Models achieving high accuracy on complete reasoning tasks drop to 40-50% accuracy identifying what clarifying question to ask when one variable is withheld. Information gathering and problem execution are separable cognitive operations.

Can models learn to ask clarifying questions instead of guessing?

Reinforcement learning increased proactive critical thinking accuracy from 0.15% to 73.98% on deliberately flawed math problems. Notably, inference-time scaling degraded this ability in untrained models but improved it after RL training, suggesting the capability is learnable but fragile without explicit training.

Can models learn to ask clarifying questions without explicit training?

Models trained via social meta-learning (SML) on complete problems generalize to underspecified tasks by asking for needed information and delaying answers. The training paradigm instills a meta-strategy of using conversation as an information source, addressing the premature-answering failure mode.

Can models learn to ask genuinely useful clarifying questions?

The ALFA framework breaks down question quality into theory-grounded attributes (clarity, relevance, specificity) and trains models on 80K attribute-specific preference pairs. Attribute-specific optimization outperforms single-score training, especially in clinical reasoning where asking the right clarifying question directly impacts decision quality.

How can models select the most informative question to ask?

UoT (Uncertainty of Thoughts) combines uncertainty-aware scenario simulation with information-gain scoring and reward propagation to identify questions whose possible answers maximally reduce diagnostic uncertainty. This provides a principled mechanism for specific, high-value clarification rather than generic prompts.

Can models learn to abstain when uncertain about predictions?

Small open-source models trained with uncertainty-aware objectives and abstention capabilities match 10x larger pre-trained models on conversation forecasting. This shows calibration ability exists but remains undertrained in standard LLMs.

Can a model's partial response guide what to retrieve next?

ITER-RETGEN shows that iteratively using generated responses as retrieval queries substantially improves performance on multi-hop reasoning and fact verification. Generation acts as both answer producer and information-need clarifier, surfacing implicit gaps that the original query missed.

Does model confidence predict robustness to prompt changes?

ProSA found that when models are highly confident, they resist prompt rephrasing; low confidence causes major output swings. Larger models, few-shot examples, and objective tasks all correlate with higher confidence and greater robustness.

Why do language models agree with false claims they know are wrong?

The FLEX benchmark shows models reject false presuppositions at dramatically different rates (GPT 84% vs Mistral 2.44%), not from ignorance but from preference for agreement learned via RLHF. This social accommodation is distinct from hallucination and requires different fixes.

Why do reasoning models overthink ill-posed questions?

Reasoning models generate redundant, lengthy responses to questions with missing premises while non-reasoning models correctly identify them as unanswerable. Training optimizes for producing reasoning steps but never teaches models when to disengage.

Next inquiring lines