Psychology and Social Cognition

Does theory of mind predict who thrives in AI collaboration?

Explores whether perspective-taking ability—the capacity to model another's cognitive state—differentiates humans who benefit most from working with AI, separate from solo problem-solving skill.

Note · 2026-02-23 · sourced from Human Centered Design
Related notes: Why do AI agents fail to take initiative? · Why do LLMs excel at social norms yet fail at theory of mind?

Collaborative ability with AI is a separable construct from individual problem-solving ability. A Bayesian Item Response Theory framework applied to human-AI benchmark data (n=667 across math, physics, and moral reasoning) estimates both parameters independently while controlling for task difficulty. The key finding: the two abilities are distinct, and what predicts one does not predict the other.
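The two-parameter setup can be sketched as a logistic IRT likelihood in which solo trials load only on individual ability and collaborative trials add a separate collaborative parameter. This is a minimal sketch of the idea, not the paper's exact model; the names `theta` (individual ability), `kappa` (collaborative ability), and `difficulty` are illustrative assumptions:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def p_correct(theta, kappa, difficulty, with_ai):
    """Probability of a correct response on one item.

    Solo trials depend only on individual ability theta;
    collaborative trials add the separate parameter kappa,
    so the two abilities are identified by contrasting
    solo and with-AI performance on items of known difficulty.
    """
    ability = theta + (kappa if with_ai else 0.0)
    return sigmoid(ability - difficulty)

def log_likelihood(trials, theta, kappa, difficulties):
    """Log-likelihood of one person's trial record.

    trials: list of (item_index, with_ai, correct) tuples.
    difficulties: per-item difficulty parameters.
    """
    ll = 0.0
    for item, with_ai, correct in trials:
        p = p_correct(theta, kappa, difficulties[item], with_ai)
        ll += math.log(p if correct else 1.0 - p)
    return ll
```

In a full Bayesian treatment this likelihood would sit under priors on `theta`, `kappa`, and the difficulties, with posteriors estimated jointly across all participants and items; the point of the sketch is only that the two abilities enter the model as separate parameters.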

Theory of Mind is the differentiating mechanism. Users with stronger perspective-taking — the ability to infer and adapt to others' cognitive states — achieve superior collaborative performance with AI. But the same users show no advantage when working alone. This is not a general intelligence effect. It is specifically the capacity to model what the AI knows, what it can do, and how to delegate to it that produces the collaboration gain.

The ToM link operates at two timescales. Stable individual differences in perspective-taking predict overall collaborative ability. But moment-to-moment fluctuations in ToM also influence AI response quality within sessions — users who adaptively model the AI's state mid-conversation get better outputs from it.

This creates an irony when combined with the findings in Why do reasoning models fail at theory of mind tasks?: the models best at solving problems independently may be worst at supporting collaborative work. If collaboration quality depends on bidirectional ToM — the user modeling the AI and the AI modeling the user — then optimizing models for raw capability may degrade the very property that makes collaboration productive.

The practical implication is that collaborative ability (κ) is a distinct benchmark axis. Comparing κ across models (κ_GPT4o vs κ_Llama) quantifies how much each model amplifies human performance, independent of the model's standalone capability. This reframes AI evaluation from "how smart is the model?" to "how much smarter does the human-AI team become?"
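Holding a fixed human ability level, a model-level κ can be converted into an expected team gain over solo performance across a set of items. A minimal sketch, assuming the same logistic form as above; `kappa_model` and the difficulty list are hypothetical inputs, not values from the study:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def team_gain(theta, kappa_model, difficulties):
    """Average increase in success probability when a human of
    ability theta pairs with a model whose collaborative
    parameter is kappa_model, over a set of item difficulties."""
    solo = [sigmoid(theta - b) for b in difficulties]
    team = [sigmoid(theta + kappa_model - b) for b in difficulties]
    return sum(t - s for t, s in zip(team, solo)) / len(difficulties)
```

Comparing two models then means evaluating `team_gain` with the same human ability and item set but each model's κ, e.g. `team_gain(theta, kappa_gpt4o, items)` versus `team_gain(theta, kappa_llama, items)`, which isolates amplification from standalone capability.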

For What breaks when humans and AI models misunderstand each other?, the synergy evidence provides empirical grounding: mutual theory of mind (MToM) is not just a design-fiction requirement but a measurable cognitive mechanism with quantifiable effects on collaboration quality.


human-AI collaborative ability is distinct from individual ability — theory of mind predicts who benefits from AI partnership