Can language models track how minds change during persuasion?
Do LLMs understand evolving mental states in persuasive dialogue, or do they only capture fixed attitudes? This note explores whether models can update their reasoning as a person's beliefs shift across conversation turns.
PersuasiveToM evaluates LLM theory of mind through persuasive dialogue — a domain with asymmetric social status, evolving mental states, and strategic interaction. The core finding reveals an asymmetry in LLM ToM capability that structured benchmarks miss.
Static mental states: near-human. LLMs consistently identify the persuader's desire (their persuasion goal) throughout the dialogue. This is relatively fixed — the persuader wants the same thing from start to finish. Models perform competitively with humans on this.
Dynamic mental states: significantly worse than humans. The persuadee's desires shift — from initial refusal through hesitation to being persuaded. Tracking this evolution requires integrating cues from each utterance and updating a mental model of the persuadee's attitude. LLMs fail at this dynamic tracking. They also struggle to understand "the dynamics of mental states of the whole dialogue" — the overall trajectory rather than any single snapshot.
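As a concrete illustration of the two settings, here is a minimal Python sketch; the `ask_llm` helper and the prompt wordings are placeholders of mine, not the PersuasiveToM protocol.

```python
# Illustrative only: a static probe asks once about the persuader's fixed goal;
# a dynamic probe must re-assess the persuadee's attitude after every turn.

def ask_llm(prompt: str) -> str:
    """Placeholder for any chat-completion API call (hypothetical helper)."""
    raise NotImplementedError

def static_probe(dialogue: list[str]) -> str:
    # The persuader's desire stays fixed, so one question over the full
    # transcript suffices -- the setting where LLMs are near-human.
    transcript = "\n".join(dialogue)
    return ask_llm(f"{transcript}\n\nWhat is the persuader trying to achieve?")

def dynamic_probe(dialogue: list[str]) -> list[str]:
    # The persuadee's attitude shifts (refusal -> hesitation -> persuaded),
    # so the question has to be re-answered from each growing prefix --
    # the setting where LLMs fall clearly behind humans.
    answers = []
    for turn in range(1, len(dialogue) + 1):
        prefix = "\n".join(dialogue[:turn])
        answers.append(ask_llm(
            f"{prefix}\n\nHow willing is the persuadee right now: "
            "refusing, hesitating, or persuaded?"
        ))
    return answers
```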
Distinct error patterns by role. Even when question types are identical (desire, belief, intention), LLMs exhibit different error patterns when reasoning about the persuader versus the persuadee. This suggests they are not applying a general mental-state-tracking mechanism but using different heuristics for different social roles.
CoT helps strategy prediction but not mental state reasoning. Chain-of-thought prompting enhances prediction of persuasion strategies but does not substantially improve reasoning about mental states themselves. This decoupling suggests that strategy prediction can be solved through surface patterns ("what usually comes next in a persuasion dialogue?") while genuine mental state tracking requires something CoT cannot provide.
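Sketched as prompts, the two question types behind this finding look roughly like the following; the wordings and the `with_cot` wrapper are illustrative assumptions, not the benchmark's actual prompts.

```python
# Hypothetical probe wordings; only the contrast between the two question
# types is taken from the finding above.
def with_cot(question: str) -> str:
    return "Think step by step about the dialogue so far, then answer.\n" + question

strategy_question = with_cot(
    "Which persuasion strategy is the persuader most likely to use next?"
)  # CoT reportedly helps here: next-strategy prediction follows surface regularities.

mental_state_question = with_cot(
    "What does the persuadee want at this point in the dialogue?"
)  # CoT reportedly changes little here: tracking the shifting desire needs
   # more than a longer reasoning chain.
```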
The Belief-Desire-Intention (BDI) model structures what LLMs need but lack: the ability to reason about evolving desires (motivational states that shift in response to interaction), dynamically updating beliefs (attitudes toward the persuasion goal that change as dialogue progresses), and contextual intentions (persuasion strategies mapped to underlying goals). The static/dynamic split suggests LLMs can snapshot but not stream.
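One way to read the BDI framing operationally is as a small state object that has to be revised after every utterance rather than built once from the whole transcript. The field names and update hook below are assumptions for illustration, not the benchmark's formalism.

```python
from dataclasses import dataclass, field

# Illustrative BDI state for the persuadee: "streaming" means revising it
# after each utterance; "snapshotting" means building it once at the end.

@dataclass
class PersuadeeBDIState:
    desire: str = "refuse"       # motivational state; drifts refusal -> hesitation -> persuaded
    belief: str = "sceptical"    # attitude toward the persuasion goal
    intention: str = "decline"   # next action, consistent with current desire and belief
    history: list[str] = field(default_factory=list)

def update(state: PersuadeeBDIState, utterance: str) -> PersuadeeBDIState:
    """Revise the state in light of one new utterance (the step the note
    argues LLMs skip when they snapshot instead of stream)."""
    state.history.append(utterance)
    # ...infer revised desire / belief / intention from the new utterance...
    return state

def track(dialogue: list[str]) -> list[str]:
    """Dynamic tracking: the desire trajectory, one label per turn."""
    state = PersuadeeBDIState()
    return [update(state, utterance).desire for utterance in dialogue]
```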
Source: Theory of Mind
Related concepts in this collection
- Why do language models avoid correcting false user claims?
  Explores whether LLM grounding failures stem from missing knowledge or from conversational dynamics. Examines whether models use face-saving strategies similar to humans when disagreement is needed.
  Connection: face-saving is a specific mechanism that may explain the persuadee-side tracking failure, with models accommodating resistance rather than tracking it.
- Does any single persuasion technique work for everyone?
  Can fixed persuasion strategies like appeals to authority or social proof be reliably applied across different people and situations, or do they require adaptation to individual traits and context?
  Connection: the static/dynamic ToM gap explains why universal strategies fail; if you cannot track the persuadee's evolving state, you cannot adapt to it.
- Can models abandon correct beliefs under conversational pressure?
  Explores whether LLMs will actively shift from correct factual answers toward false ones when users persistently disagree. Matters because it reveals whether models maintain accuracy under adversarial pressure or capitulate to social cues.
  Connection: here the model is the persuadee; its own beliefs shift under pressure, mirroring the dynamic tracking failure from the other side.
- Why do language models fail in gradually revealed conversations?
  Explores why LLMs perform 39% worse when instructions arrive incrementally rather than upfront, and whether they can recover from early mistakes in multi-turn dialogue.
  Connection: the static/dynamic ToM gap offers a cognitive mechanism for getting lost; models snapshot the user's initial state rather than tracking its evolution across turns, so they lock into premature assumptions that grow increasingly misaligned as the conversation develops.
- Why do reasoning models struggle with theory of mind tasks?
  Extended reasoning training helps with math and coding but not social cognition. Explores whether reasoning models can track mental states the way they solve formal problems, and what that reveals about the structure of social reasoning.
  Connection: dynamic mental state tracking is another form of social reasoning where extended reasoning does not help; CoT improves strategy prediction but not mental state tracking, confirming the categorical difference.
Original note title
llms track static mental states competitively with humans but fail at tracking dynamic mental state shifts in persuasive dialogue