Can models learn to ask better clarifying questions through self-improvement?

This note explores whether question-asking is a trainable skill that improves when models are rewarded for questions that lead to better answers. It matters because good clarifying questions could help AI systems handle underspecified user requests.

Synthesis note · 2026-06-03 · sourced from Self Refinement Self Consistency Feedback

Users often leave important aspects of a request unsaid. Asking questions could resolve the ambiguity, but models tend to ask poor ones. STaR-GATE applies self-improvement (STaR) to question-asking itself. It builds a synthetic dataset of 25,500 persona-task prompts, each simulating a Questioner in conversation with a Roleplayer whose preferences are hidden. The Questioner asks questions to elicit those preferences and is then iteratively finetuned on the questions that increased the probability of high-quality responses, where the high-quality responses are generated by an Oracle with access to the Roleplayer's latent preferences. After two iterations of self-improvement, the Questioner asks better questions and produces responses preferred over the initial model's on 72% of tasks.
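The filtering step at the core of this loop can be sketched as follows. This is a minimal illustration, not the authors' code: the `Conversation` record, the `oracle_logprob` field, and the per-task `top_k` selection rule are assumptions introduced here. The note itself says only that the Questioner is finetuned on the questions that increased the probability of high-quality responses.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Conversation:
    # All field names are hypothetical, chosen for this sketch.
    task_id: str           # persona-task prompt the dialogue was sampled for
    questions: list[str]   # Questioner turns in the simulated dialogue
    oracle_logprob: float  # assumed score: log p(Oracle response | dialogue)


def select_for_finetuning(conversations, top_k=1):
    """Keep, per task, the top_k dialogues that make the Oracle's
    response most probable. top_k is an assumed hyperparameter."""
    by_task = defaultdict(list)
    for conv in conversations:
        by_task[conv.task_id].append(conv)

    selected = []
    for task_id in sorted(by_task):
        # Highest log-probability first.
        ranked = sorted(
            by_task[task_id], key=lambda c: c.oracle_logprob, reverse=True
        )
        selected.extend(ranked[:top_k])
    return selected


# Toy data: two sampled dialogues for task t1, one for task t2.
sampled = [
    Conversation("t1", ["What do you like?"], -42.0),
    Conversation("t1", ["Any dietary restrictions?"], -17.5),
    Conversation("t2", ["What is your budget?"], -20.0),
]
kept = select_for_finetuning(sampled, top_k=1)
```

In a full loop, the kept dialogues would become the finetuning set for the next Questioner, and sampling would repeat with the updated model. Ranking within each task rather than across the whole dataset keeps easy tasks from crowding out hard ones, though that choice is also an assumption of this sketch.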

The keeper is that eliciting preferences is a trainable skill, improvable by self-play against simulated users: reward the questions that lead to better downstream answers, and question-asking improves without human-written question supervision. It targets the elicitation half of personalization that prompt-stuffing and persona-assignment skip.

This is a strong fit for Adrian's clarification/proactivity thread. It pairs with "Can models learn to ask clarifying questions without explicit training?" (emergent vs explicitly-rewarded question-asking) and addresses the deficit named by "Why can't advanced AI models take initiative in conversation?": STaR-GATE trains the initiative that passive next-turn optimization suppresses.

Related concepts in this collection 3

Can models learn to ask clarifying questions without explicit training?
Do language models trained only on fully-specified problems spontaneously develop the ability to ask for missing information when facing underspecified tasks? This tests whether conversational problem-solving strategies emerge from meta-learning rather than direct instruction.
emergent vs explicitly-rewarded clarifying-question behavior

Which clarifying questions actually improve user satisfaction?
Not all clarification helps equally. This explores whether asking users to rephrase their needs works as well as asking targeted questions about specific information gaps.
STaR-GATE trains for the useful kind of question this note distinguishes

How can models select the most informative question to ask?
Explores whether simulating possible futures and scoring questions by information gain can identify which clarifying question would best reduce uncertainty, moving beyond just deciding whether to ask toward deciding what to ask.
both improve question quality; UoT via inference-time search, STaR-GATE via training
