Psychology and Social Cognition · Language Understanding and Pragmatics

Can language models simulate belief change in people?

Current LLM social simulators treat behavior as input-output mappings without modeling internal belief formation or revision. Can they be redesigned to actually track how people think and change their minds?

Note · 2026-05-03 · sourced from World Models

Most LLM-based social simulations rely on simplified input-output mappings: feed in demographics and persona descriptions, get out plausible behavior. This mirrors the logic of behaviorism in psychology, which models behavior as a function of external stimuli while ignoring internal cognitive states. The history of psychology moved from behaviorism to cognitivism (structured internal representations and causal reasoning) to constructivism (beliefs continually shaped by experience), but LLM-based agents remain at the first stage: they exhibit shallow reasoning, frequent hallucinations, and limited understanding of causal and contextual dynamics in the policy domains where reasoning fidelity matters most. The same behaviorist diagnosis surfaces in "How do we generate realistic personas at population scale?".

The structural failures stem directly from the behaviorist paradigm. Modeling fails because agents lack representations of how beliefs are formed, updated, or justified — without reasoning traces, they cannot support diagnostic explanation, causal attribution, or meaningful intervention. Evaluation fails because metrics judge outputs by plausibility or alignment with population-level trends rather than by whether the reasoning is accurate, flexible, or aligned with how people actually think. Calibration fails because aligning agents with stakeholders requires individual-level reasoning data, which is mostly missing.

The proposed alternative is to model individuals as Generative Minds (GenMinds): agents whose beliefs, values, and causal assumptions are represented compositionally as causal belief networks, with each node a concept and each directed edge a causal relation. Reasoning emerges from reusable cognitive motifs — fragments that compose across contexts — rather than from regenerating full-context responses each turn. This compositionality is a cornerstone of human cognition; it is also computationally efficient because shared motifs do not have to be regenerated per query.
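The note does not specify an implementation, but the representation it describes (nodes as concepts, directed edges as causal relations, motifs as reusable fragments that compose across contexts) can be sketched minimally. Everything below is hypothetical: the class name, the signed-strength edge encoding, and the example concepts are illustrative assumptions, not the GenMinds design.

```python
from dataclasses import dataclass, field

@dataclass
class BeliefNetwork:
    # Hypothetical encoding: each directed edge (cause, effect) carries a
    # signed strength in [-1, 1]; the concrete representation is assumed.
    edges: dict = field(default_factory=dict)

    def add_relation(self, cause: str, effect: str, strength: float) -> None:
        self.edges[(cause, effect)] = strength

    def compose(self, motif: "BeliefNetwork") -> None:
        # Reuse a motif (a fragment of causal relations) in this network
        # without regenerating it; this is the compositionality claim.
        self.edges.update(motif.edges)

    def effects_of(self, concept: str) -> dict:
        return {e: s for (c, e), s in self.edges.items() if c == concept}

# A reusable cognitive motif is itself a small network fragment.
economic_anxiety = BeliefNetwork()
economic_anxiety.add_relation("job insecurity", "distrust of institutions", 0.7)

# An individual agent composes shared motifs with context-specific beliefs.
agent = BeliefNetwork()
agent.add_relation("local factory closure", "job insecurity", 0.9)
agent.compose(economic_anxiety)  # same motif could compose into other agents
```

The efficiency argument in the text falls out of `compose`: a motif is built once and merged by reference-style reuse into many agents or contexts, rather than being regenerated per query.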

The evaluation framework that follows is RECAP (REconstructing CAusal Paths), which assesses reasoning fidelity along three axes: traceability (can you inspect how a stance was formed?), counterfactual adaptability (does the agent revise predictably when an intervention is applied?), and motif compositionality (are the same motifs reused across unrelated topics?). The shift from output-plausibility to reasoning-fidelity benchmarks is the essential move: without it, behaviorist agents that produce coherent-sounding outputs continue to pass evaluations they should fail.
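Two of the RECAP axes can be made concrete against the edge representation sketched above. This is a hedged illustration, not the RECAP implementation: `trace_path` and `intervene` are invented names, and the edge dictionary format is an assumption.

```python
def trace_path(edges: dict, start: str, goal: str, path=None):
    """Traceability probe: recover a causal chain from evidence to stance,
    if one exists, by depth-first search over directed edges."""
    path = (path or []) + [start]
    if start == goal:
        return path
    for (cause, effect) in edges:
        if cause == start and effect not in path:
            found = trace_path(edges, effect, goal, path)
            if found:
                return found
    return None  # stance is not traceable from this evidence

def intervene(edges: dict, node: str) -> dict:
    """Counterfactual probe: sever a node's outgoing causal influence and
    return the revised network, so downstream revision can be checked."""
    return {(c, e): (0.0 if c == node else s) for (c, e), s in edges.items()}

edges = {
    ("factory closure", "job insecurity"): 0.9,
    ("job insecurity", "distrust of institutions"): 0.7,
}

# Traceability: the stance should be reachable from the evidence.
chain = trace_path(edges, "factory closure", "distrust of institutions")

# Counterfactual adaptability: removing the mediating cause should
# predictably zero out its downstream effect.
revised = intervene(edges, "job insecurity")
```

The contrast with output-plausibility metrics is visible here: both probes inspect the belief structure itself, and an agent that only emits fluent text (with no inspectable `edges`) fails them by construction.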



Related concepts in this collection


simulating society faithfully requires simulating thought not behavior — current LLM social simulation is a behaviorist demographics-in-behavior-out paradigm that cannot model belief revision