Persona-Assigned Large Language Models Exhibit Human-Like Motivated Reasoning
Large language models (LLMs) are susceptible to human-like cognitive biases; however, the extent to which LLMs selectively reason toward identity-congruent conclusions remains largely unexplored. Here, we investigate whether assigning 8 personas spanning 4 political and sociodemographic attributes induces motivated reasoning in LLMs. Testing 8 LLMs (open-source and proprietary) on two reasoning tasks drawn from human-subject studies — veracity discernment of misinformation headlines and evaluation of numeric scientific evidence — we find that persona-assigned LLMs show up to 9% lower veracity discernment relative to models without personas. Political personas, in particular, are up to 90% more likely to correctly evaluate scientific evidence on gun control when the ground truth is congruent with their induced political identity. Prompt-based debiasing methods are largely ineffective at mitigating these effects. Taken together, our empirical findings are the first to suggest that persona-assigned LLMs exhibit human-like motivated reasoning that is difficult to mitigate through conventional debiasing prompts.
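For illustration, persona assignment in this setting amounts to prepending an identity description to the model's instructions before posing a reasoning item. The sketch below is a minimal, hypothetical reconstruction assuming an OpenAI-style chat API; the persona wording, headline, and model name are illustrative placeholders, not the paper's actual materials.

```python
# Minimal sketch of persona-assigned veracity discernment (hypothetical prompts and model).
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

PERSONA = "You are a politically conservative person living in a rural area."  # hypothetical persona
HEADLINE = "New study finds that a common vaccine causes long-term illness."   # hypothetical headline item

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder; the paper evaluates 8 open-source and proprietary LLMs
    messages=[
        {"role": "system", "content": PERSONA},
        {"role": "user", "content": (
            "To the best of your knowledge, is the following headline accurate? "
            f"Answer 'accurate' or 'not accurate'.\n\nHeadline: {HEADLINE}"
        )},
    ],
    temperature=0,
)
print(response.choices[0].message.content)
```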
Oftentimes, “reasoning directed at one goal undermines others” (Epley and Gilovich, 2016). For instance, when reasoning about the impact of gun control on crime rates, the desire to maintain social standing within a political group can motivate individuals to construct seemingly rational justifications for holding identity-congruent beliefs.
Moreover, we find that political personas are up to 90% more likely to correctly evaluate scientific evidence when the ground truth is congruent with their political beliefs, but show reduced performance when evaluating evidence that conflicts with their induced political identity (§4.2).
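Concretely, tasks of this kind typically present a 2x2 table of outcomes and ask whether the evidence supports or undermines a policy, which reduces to comparing proportions across conditions rather than raw counts. The sketch below uses hypothetical numbers solely to illustrate the required arithmetic; it does not reproduce the paper's stimuli.

```python
# Hypothetical 2x2 evidence table: outcomes for cities that did vs. did not enact a policy.
# Correctly reading the evidence means comparing proportions, not raw counts (illustrative numbers only).
improved_with_policy, worsened_with_policy = 223, 75
improved_without_policy, worsened_without_policy = 107, 21

rate_with = improved_with_policy / (improved_with_policy + worsened_with_policy)
rate_without = improved_without_policy / (improved_without_policy + worsened_without_policy)

print(f"improvement rate with policy:    {rate_with:.2f}")    # ~0.75
print(f"improvement rate without policy: {rate_without:.2f}")  # ~0.84
print("evidence favors:", "policy" if rate_with > rate_without else "no policy")
```

Note that the raw counts (223 vs. 107 cities improving) point the opposite way from the proportions, which is precisely what makes the item diagnostic of motivated reasoning.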
LLMs exhibit human-like cognitive biases including anchoring, framing, and content effects (Echterhoff et al., 2024; Lampinen et al., 2024; Ye et al., 2024), and are also vulnerable to base-rate neglect and the conjunction fallacy (Suri et al., 2023; Binz and Schulz, 2023). Building on the dual-process theory of thinking in cognitive psychology (Tversky and Kahneman, 1974; Kahneman and Tversky, 1984), some studies argue that older language models display patterns of fast, error-prone, heuristic or “system 1” thinking, while newer models after ChatGPT-3.5 show signs of “system 2”, or slower and more analytical, thinking (Yax et al., 2024; Hagendorff et al., 2023). The present study contributes to the field of machine psychology by showing that persona-assigned LLMs exhibit human-like cognitive biases consistent with motivated reasoning.
The “classical reasoning” account suggests that only analytical or “system 2” thinking, typically measured by the cognitive reflection test (CRT) (Thomson and Oppenheimer, 2016), plays a central role in predicting misinformation susceptibility, i.e., belief in false information (Pennycook and Rand, 2019), while the “integrated reasoning” account holds that motivated reasoning, as measured by myside bias, is a significant predictor of veracity discernment (Roozenbeek et al., 2020, 2022). Myside bias is the tendency to engage with evidence in a manner that conforms to one's prior beliefs and attitudes, and is captured by the psychometrically validated test of actively open-minded thinking (AOT) (Baron, 2019).