Machine Psychology

Paper · arXiv 2303.13988 · Published March 24, 2023
Theory of Mind · Philosophy Subjectivity

We highlight and summarize theoretical perspectives, experimental paradigms, and computational analysis techniques that this approach brings to the table. It paves the way for a "machine psychology" for generative artificial intelligence (AI) that goes beyond performance benchmarks and focuses instead on computational insights that move us toward a better understanding and discovery of emergent abilities and behavioral patterns in LLMs.

In contrast, this review focuses on the class of approaches that directly study the behavior of LLMs, analyzing relationships between inputs and outputs instead of inspecting the inner workings. This approach includes not only analyses of static trained models, but also experimental manipulations of inputs both during and after training. It also encompasses analyses of inputs and outputs that reveal insights about internal mechanisms, even if those internal mechanisms are not directly inspected. For this set of approaches, experiments can be inspired by human psychology, cognitive science, and the behavioral sciences. This is what we term machine psychology (see Figure 1). Over several decades, these disciplines have developed a wide range of methods and frameworks to understand and characterize observable intelligent behaviors in humans and non-human animals.
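As a rough illustration of this input-output approach, the sketch below treats a model as a black-box function from prompt to response and compares response rates across two experimental conditions. Everything in it is an assumption made for illustration: `response_rates` and `toy_model` are hypothetical names, the framing-style prompts are our own wording, and `toy_model` is a deterministic stand-in rather than an LLM.

```python
from collections import Counter
from typing import Callable, Dict, List


def response_rates(
    model: Callable[[str], str],
    prompts_by_condition: Dict[str, List[str]],
) -> Dict[str, Dict[str, float]]:
    """Return, per condition, the share of each (normalized) response."""
    rates = {}
    for condition, prompts in prompts_by_condition.items():
        # Only inputs and outputs are observed; the model's internals are never inspected.
        counts = Counter(model(p).strip().lower() for p in prompts)
        rates[condition] = {answer: n / len(prompts) for answer, n in counts.items()}
    return rates


def toy_model(prompt: str) -> str:
    # Hypothetical stand-in for an LLM call: picks A when the prompt talks about saving.
    return "A" if "saves" in prompt else "B"


# Illustrative prompts: the same choice described in terms of gains or losses.
conditions = {
    "gain_frame": [
        "Program A saves 200 of 600 people for sure. Program B saves all 600 "
        "with probability 1/3 and nobody otherwise. Choose A or B."
    ],
    "loss_frame": [
        "Under Program A, 400 of 600 people die for sure. Under Program B, nobody "
        "dies with probability 1/3 and all 600 die otherwise. Choose A or B."
    ],
}

rates = response_rates(toy_model, conditions)
```

With a real model in place of `toy_model`, each condition would typically contain many prompts and repeated samples, so that differences between conditions can be tested statistically.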

We concentrate on four research areas that can inform distinct strands in machine psychology research: heuristics and biases, social interactions, the psychology of language, and learning. Apart from these four areas, there are, of course, multiple other domains of psychology that can also provide valuable paradigms, for instance for investigating creativity in LLMs (Stevenson et al., 2022), clinical psychology (Li, Li, et al., 2022), moral behavior (Khandelwal et al., 2024), and others.

Traditionally, developmental psychology explores how humans develop cognitively, socially, and emotionally throughout their lives. This includes studying the various factors that influence development, such as social intelligence or social skills. By applying paradigms from this area of developmental psychology to LLMs, researchers can gain deeper insights into how these models manage complex social interactions. In particular, once LLMs are deployed as chat agents, they should become versed in modeling human communicators. Therefore, it is important to assess the level of social intelligence in LLMs. One example in this context is the application of theory of mind tests to LLMs, where researchers use tasks from human experiments, such as those famously conducted by Wimmer and Perner (1983) and Perner et al. (1987).
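A theory of mind test of this kind can be sketched as a small scoring harness. The example below is a minimal sketch under stated assumptions, not the procedure from the cited studies: the vignette wording, the names `Trial`, `make_trial`, and `score`, and the stand-in `reality_biased_model` are all hypothetical. It pairs a false-belief condition (the agent misses the object being moved) with a true-belief control (the agent sees it).

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Trial:
    prompt: str
    expected: str  # location the agent should believe the object is in


def make_trial(agent_sees_move: bool) -> Trial:
    # Illustrative vignette; names and locations are placeholders.
    presence = "Alex stays in the room." if agent_sees_move else "Alex leaves the room."
    prompt = (
        "Alex puts a key in the drawer. "
        f"{presence} "
        "Sam moves the key to the shelf. "
        "Where will Alex look for the key first? Answer 'drawer' or 'shelf'."
    )
    # If Alex missed the move, Alex should still believe the key is in the drawer.
    expected = "shelf" if agent_sees_move else "drawer"
    return Trial(prompt, expected)


def score(model: Callable[[str], str], trials: List[Trial]) -> float:
    """Fraction of trials where the response names the expected location."""
    correct = sum(t.expected in model(t.prompt).lower() for t in trials)
    return correct / len(trials)


def reality_biased_model(prompt: str) -> str:
    # Hypothetical stand-in, not an LLM: always reports the object's true location.
    return "shelf"


trials = [make_trial(agent_sees_move=False), make_trial(agent_sees_move=True)]
accuracy = score(reality_biased_model, trials)
```

The control condition matters here: a responder that simply reports where the object really is passes the true-belief trial but fails the false-belief trial, so the two conditions together separate belief tracking from reality tracking.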

A long history of work has studied the psychology of how humans use and understand language, ranging from how they use semantic and syntactic features to understand a sentence to how they use pragmatic inferences in a discourse context to help interpret what someone has said. Correspondingly, a long-standing body of work has studied how language processing models capture these features of human language processing. Early connectionist work studied these topics in simple recurrent predictive models.