Why can't users and AI articulate shared goals together?

This explores why goal-setting between a person and an AI is a *joint* failure — the user can't fully say what they want, the AI won't probe to help them find out, and so a shared target never gets built between them.

This reads the question as being about a two-sided breakdown, not a one-sided one. The interesting finding in the corpus is that articulating goals isn't something either party can do alone — it's supposed to happen *between* them, and that's exactly the part that's broken. On the human side, intent doesn't exist fully formed waiting to be typed out; it matures through interaction, resolving constraints bit by bit with stops and starts (How do users actually form intent when prompting AI systems?). The 'gulf of envisioning' names the trap directly: users can't articulate what they want, and AI doesn't help them discover it (Why can't users articulate what they want from AI?). Goal articulation is a collaborative act that neither side currently performs.

The reason the AI doesn't close the gap is structural, not a matter of capability. Conversational models are trained to respond to queries, not to lead: they can't initiate topics, hold their own goals, or steer toward a shared one (Why can't conversational AI agents take the initiative?). Next-turn reward optimization actively removes initiative from the model (Why do AI agents fail to take initiative?). So when a user shows up with half-formed intent, the AI mirrors the vagueness back instead of probing it. The numbers are stark: in UserBench's multi-turn evaluations, models fully align with user intent only about 20% of the time, and even top models uncover fewer than 30% of user preferences through active questioning, because they assume too early and ask too little (Why do AI agents miss most of what users actually want?).

What would a working version look like? The corpus reframes goal-setting as something agents must *actively structure* — deciding what to share, what to ask, and what to infer, rather than passively waiting and presuming a shared understanding already exists (Can AI agents communicate efficiently in joint decision problems?). Conversation analysis offers a concrete mechanism: 'insert-expansions,' the small clarifying detours humans naturally take to scope a request before acting, which prevent misunderstanding instead of recovering from it after a silent wrong turn (When should AI agents ask users instead of just searching?). The good news is that this proactivity is trainable — clarification-seeking behaviors jumped from near-zero to ~74% with reinforcement learning (Why do AI agents fail to take initiative?).
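
To make the insert-expansion idea concrete, here is a minimal sketch of an agent that takes a clarifying detour before acting whenever a constraint is still open. Everything in it is an illustrative assumption rather than something taken from the cited notes: the slot names in `REQUIRED_SLOTS`, the `Request` class, and the `next_move` and `run_dialogue` helpers are all hypothetical, and a real system would infer open constraints instead of reading them from a fixed list.

```python
from dataclasses import dataclass, field

# Hypothetical constraints a request might need before the agent can act.
REQUIRED_SLOTS = ("destination", "budget", "dates")


@dataclass
class Request:
    """A user request whose constraints are resolved incrementally."""
    text: str
    slots: dict = field(default_factory=dict)

    def missing(self):
        # Constraints the user has not pinned down yet, in a fixed order.
        return [s for s in REQUIRED_SLOTS if s not in self.slots]


def next_move(request):
    """Return ("ask", question) while constraints are open, else ("act", plan)."""
    open_slots = request.missing()
    if open_slots:
        # Insert-expansion: a clarifying detour taken before acting.
        return ("ask", f"Before I proceed, what is your {open_slots[0]}?")
    plan = ", ".join(f"{k}={request.slots[k]}" for k in REQUIRED_SLOTS)
    return ("act", f"Proceeding with {plan}")


def run_dialogue(request, answers):
    """Drive the loop with scripted user answers and return the transcript."""
    transcript = []
    while True:
        kind, message = next_move(request)
        transcript.append((kind, message))
        if kind == "act":
            return transcript
        slot = request.missing()[0]
        if slot not in answers:
            # No answer available: stop rather than guess on the user's behalf.
            return transcript
        request.slots[slot] = answers[slot]
```

Calling `run_dialogue(Request("plan a trip", {"destination": "Lisbon"}), {"budget": "1000 EUR", "dates": "May 3-7"})` yields two "ask" turns followed by one "act" turn. The design choice worth noticing is that the agent never acts while a constraint is open, which is the prevention-over-recovery behavior described above.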

Here's the thing a curious reader might not expect: even perfect probing may not be enough, because shared goals require shared *grounding*. A true thought partner needs three things — mutual understanding, legibility, and a shared world model — which demands real cognitive architecture, not just more training data (What makes an AI a true thought partner, not just a tool?). And a deeper critique argues that an AI manipulating symbols without contact with the world can't guarantee its encoded goals actually correspond to what you mean by them (Can AI systems achieve real alignment without world contact?). Shared goals aren't just words you both agree on — they're anchored in a shared reality the AI may not have access to.

So the answer to 'why can't they articulate shared goals together' is layered: the human's intent is still forming, the AI is built to react rather than co-create, and even a proactive AI may lack the common ground that makes a goal genuinely *shared*. The frontier worth watching is whether collaboration can be designed in — through structured dialogue and trained initiative — rather than waited for.

Sources (9 notes)

How do users actually form intent when prompting AI systems?

Human intent matures through progressive constraint resolution with fluctuating stability, not as a simple present-or-absent condition. The STORM framework and Clarify metric reveal that AI systems fail partly because they cannot access users' internal cognitive states during this evolution.

Why can't users articulate what they want from AI?

Intent develops through interaction, not in isolation. Since AI models respond rather than probe, they miss opportunities to help users discover unarticulated requirements. Structured dialogue that presents model-generated options shifts the cognitive burden from open-ended envisioning to constrained evaluation.

Why can't conversational AI agents take the initiative?

Research shows LLMs including ChatGPT cannot initiate topics, plan strategically, or lead conversations because their training optimizes for responding to queries, not creating dialogue from agent goals. This passivity is reinforced by alignment objectives and masked by fluent-sounding outputs.

Why do AI agents fail to take initiative?

Research shows next-turn reward optimization structurally removes initiative from models, but proactive behaviors like critical thinking and clarification-seeking are trainable (0.15% to 73.98% with RL). The core challenge is balancing proactivity with civility to avoid intrusion.

Why do AI agents miss most of what users actually want?

UserBench measured multi-turn interactions where users reveal goals incrementally and found models achieve full intent alignment just 20% of the time. Even top models uncover fewer than 30% of user preferences through active querying, suggesting passivity and premature assumption-making are systematic failures.

Can AI agents communicate efficiently in joint decision problems?

Human-AI collaboration on joint decisions demands that AI agents actively determine what to share, ask, and infer rather than passively respond. LLMs currently fail at this structured communication because they lack goal-driven initiative and presume shared understanding rather than building it.

When should AI agents ask users instead of just searching?

Tool-enabled LLMs drift from user intent through silent tool chaining. Conversation analysis reveals insert-expansions—clarifying intent, scoping responses, enhancing appeal—as a formal framework for proactive user consultation that prevents misunderstanding instead of recovering from it.

What makes an AI a true thought partner, not just a tool?

Collins et al. show that thought partners require three reciprocal desiderata grounded in behavioral science: mutual understanding, legibility, and shared world models. This demands explicit cognitive architectures—Bayesian theory of mind, resource-rationality, goal planning—rather than scaling foundation models on human feedback alone.

Can AI systems achieve real alignment without world contact?

Peircean semiotics reveals that symbolic goal encoding without world contact and social mediation cannot guarantee correspondence to actual values. LLMs operating in pure symbol manipulation risk divergence between stated goals and real-world outcomes.
