Turning large language models into cognitive models
ask whether large language models can be turned into cognitive models. We find that – after finetuning them on data from psychological experiments – these models offer accurate representations of human behavior, even outperforming traditional cognitive models in two decision-making domains. In addition, we show that their representations contain the information necessary to model behavior on the level of individual subjects. Finally, we demonstrate that finetuning on multiple tasks enables large language models to predict human behavior in a previously unseen task. Taken together, these results suggest that large, pre-trained models can be adapted to become generalist cognitive models, thereby opening up new research directions that could transform cognitive psychology and the behavioral sciences as a whole.
We show that this approach can be used to create models that describe human behavior better than traditional cognitive models. We verify this result through extensive model simulations, which confirm that fine-tuned language models indeed show human-like behavioral characteristics. Furthermore, we find that the embeddings obtained from such models contain the information necessary to capture individual differences. Finally, we highlight that a model fine-tuned on two tasks is capable of predicting human behavior on a third, hold-out task.
We considered two paradigms that have been extensively studied in the human decision-making literature for our initial analyses: decisions from descriptions [Kahneman and Tversky, 1972] and decisions from experience [Hertwig et al., 2004]. In the former, a decision-maker is asked to choose between one of two hypothetical gambles like the ones shown in Figure 1b. Thus, for both options, there is complete information about outcome probabilities and their respective values. In contrast, the decisions from experience paradigm does not provide such explicit information. Instead, the decision-maker has to learn about outcome probabilities and their respective values from repeated interactions with the task as shown in Figure 1d.