Can models learn to ask clarifying questions without explicit training?
Do language models trained only on fully-specified problems spontaneously develop the ability to ask for missing information when facing underspecified tasks? This tests whether conversational problem-solving strategies emerge from meta-learning rather than direct instruction.
A surprising generalization result from the social meta-learning training paradigm. The training procedure uses only fully-specified problems — the student receives the complete problem statement from the first turn, and the teacher provides feedback during attempts to solve it. None of the training problems require the student to handle missing information. Yet the trained model performs significantly better on underspecified tasks at test time, where critical information is revealed only across multiple conversational turns.
The behavioral signature is specific: SML-trained models make fewer premature answer attempts and are more likely to ask for the information they need. They learn to recognize when they lack enough information to answer well and to extract that information from the conversation partner. This is the human pattern of "ask before answering when you're not sure" — emerging in an LLM that was never explicitly trained on the pattern.
The mechanism appears to be that SML training teaches the model a meta-strategy: use the conversation as a resource. This strategy generalizes from "use the conversation to refine an answer to a fully-specified problem" (training distribution) to "use the conversation to get missing information first, then answer" (test distribution). The student has learned not just to solicit corrective feedback but to model the conversation as a place where information flows.
The result can be sharpened with a two-stage training procedure called Q-priming. A preliminary SFT stage trains the model on dialogues where it has been explicitly prompted to ask questions, leveraging the teacher's private knowledge to generate good question examples. After Q-priming, online RL via SML refines the behavior further. The combined pipeline produces stronger clarifying-question behavior than either alone.
For conversational AI design, this is an existence proof: the structural skill of "ask before answering" can be installed via training rather than via runtime prompting. Systems that have struggled with the "LLM answers prematurely" failure mode can address it at the training level rather than relying on prompt engineering.
Related concepts in this collection
-
Can LLMs learn to ask for feedback during problem solving?
Explores whether language models can be trained to actively solicit corrective feedback mid-conversation rather than committing to single-turn answers. This matters because it could bridge the gap between fluent chat and genuine conversational learning.
same paper, the parent framework
-
Why does teacher-student information asymmetry enable learning signals?
What role does privileged answer access play in making social meta-learning training work? Without asymmetric information, can a conversation between teacher and student function as pedagogy or only as parallel speculation?
same paper, the training-time mechanism
-
Why can't conversational AI agents take the initiative?
Explores whether current LLMs lack the structural ability to lead conversations, set goals, or anticipate user needs—and what architectural changes might enable proactive dialogue.
directly addresses: the passivity problem this method solves
-
Why do models fail at asking good questions during interaction?
When models must actively seek information through questions rather than receive it passively, they struggle dramatically. This explores why GPT-4o plateaus at 35% accuracy and whether training or prompting can fix the underlying deficit.
adjacent: the benchmark for the capability SML produces
Click a node to walk · click center to open · click Open in graph to see this note in the full knowledge graph
Original note title
SML produces emergent clarifying-question behavior — models trained only on fully-specified problems learn to handle underspecified tasks by asking for missing information