Building Machines that Learn and Think with People

Paper · arXiv:2408.03943 · Published August 2024
Human-Centered Design · Co-Writing · Collaboration · Design Frameworks

What do we want from machine intelligence? We envision machines that are not just tools for thought, but partners in thought: reasonable, insightful, knowledgeable, reliable, and trustworthy systems that think with us. Current artificial intelligence (AI) systems satisfy some of these criteria, some of the time. In this Perspective, we show how the science of collaborative cognition can be put to work to engineer systems that really can be called “thought partners,” systems built to meet our expectations and complement our limitations. We lay out several modes of collaborative thought in which humans and AI thought partners can engage and propose desiderata for human-compatible thought partnerships. Drawing on motifs from computational cognitive science, we motivate an alternative scaling path for the design of thought partners and ecosystems around their use through a Bayesian lens, whereby the partners we construct actively build and reason over models of the human and world.

We argue that good thought partners are systems (1) which can understand us, (2) which we can understand, and (3) which have sufficient understanding of the world that we can engage on common ground.

One path to building such thought partners is to scale foundation models (e.g., LLMs [10]) with large amounts of human demonstrations and feedback, along with “traces” of human thought scraped from web-scale data [11–13]. While this approach has produced systems that accurately mimic human behavior (e.g., producing fluent text), these machines do not robustly simulate human cognition (e.g., explicitly reasoning about the world or other minds) in the ways expected of a true thought partner [3,14–20].

What would it take to design systems that meet our criteria? One promising path is to design systems that build explicit models of the task, the world, and the human (models that are structured [21], rather than distributionally learned from data), drawing on formal frameworks grounded in cognitive psychology for understanding how humans think, alone and together. In this Perspective, we chart a new vision for the design of AI thought partners. Decades of work in the behavioral sciences offer valuable guidance for designing human-centric, uncertainty-aware thought partners. Drawing on this research, we argue that effective thought partners are those that build models of both the human and the world.
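To make the contrast with purely distributional learning concrete, the sketch below shows one deliberately simple reading of “building and reasoning over a model of the human”: the partner maintains an explicit posterior over a small hypothesis space of user goals and updates it with Bayes’ rule as it observes actions. The goals, actions, and likelihood values are toy assumptions for illustration, not details from the paper.

```python
# Minimal sketch (not the paper's implementation): a thought partner that
# maintains an explicit, structured model -- here, a posterior over a small
# hypothesis space of user goals -- and updates it with Bayes' rule as it
# observes the user's actions. Goals, actions, and likelihoods are toy values.

GOALS = ["draft_email", "debug_code", "summarize_paper"]

# P(action | goal): assumed likelihood of each observed action under each goal.
LIKELIHOOD = {
    "opens_editor":   {"draft_email": 0.2, "debug_code": 0.7, "summarize_paper": 0.1},
    "pastes_pdf":     {"draft_email": 0.1, "debug_code": 0.1, "summarize_paper": 0.8},
    "types_greeting": {"draft_email": 0.7, "debug_code": 0.1, "summarize_paper": 0.2},
}

def update(prior: dict, action: str) -> dict:
    """One step of exact Bayesian updating over the goal hypotheses."""
    unnorm = {g: prior[g] * LIKELIHOOD[action][g] for g in GOALS}
    z = sum(unnorm.values())
    return {g: p / z for g, p in unnorm.items()}

belief = {g: 1.0 / len(GOALS) for g in GOALS}  # uniform prior over goals
for action in ["opens_editor", "types_greeting"]:
    belief = update(belief, action)

print(belief)  # explicit posterior over what the user is trying to do
```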

What, then, do we want from thought partners? Many criteria for tools for thought are of course relevant: efficiency, accuracy, robustness, fairness, cost, scalability, and so on. But the domains above illuminate what is distinctive about a thought partner: its relationship to the user [89]. Looking to ideas from the behavioral sciences motivates three desiderata to guide the design of human-centered thought partners:

  1. You understand me: We would like our thought partners to understand our goals, plans, (possibly false) beliefs, and resource limitations, taking into account what they have observed of us in the past and present in order to collaborate with us better in the future [90,91]. For example, a thought partner should adaptively change strategies when working with an expert, layperson, or child, meeting us where we are (see the sketch following this list).

  2. I understand you: We would like our thought partners to act in ways that are legible to us [68,92] and to communicate with us in ways we intuitively understand [93–95].

  3. We understand the world: We would like our thought partners to be tethered to reality [96]. This means being accurate and knowledgeable, but also working with a shared representation of the world, domain, or task [97–99]. Further, our use of ‘we’ emphasizes that thought partnerships are fundamentally about synergy: the partnership should amount to more than the sum of its parts.
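The first desideratum can be made concrete with a small decision-theoretic sketch: given a posterior over who the user is (expert, layperson, or child), the partner chooses the communication strategy with the highest expected utility, “meeting us where we are.” The user types, styles, and utility values below are illustrative assumptions, not values from the paper.

```python
# Toy sketch (assumed values, not the paper's method): given a posterior over
# the user's expertise, pick the explanation style with the highest expected
# communicative utility.

USER_TYPES = ["expert", "layperson", "child"]
STYLES = ["terse_technical", "plain_language", "story_with_examples"]

# U(style, user_type): assumed utility of each style for each user type.
UTILITY = {
    "terse_technical":     {"expert": 1.0, "layperson": 0.2, "child": 0.0},
    "plain_language":      {"expert": 0.6, "layperson": 1.0, "child": 0.5},
    "story_with_examples": {"expert": 0.3, "layperson": 0.7, "child": 1.0},
}

def best_style(posterior: dict) -> str:
    """Choose the style maximizing expected utility under the user-type posterior."""
    expected = {
        s: sum(posterior[t] * UTILITY[s][t] for t in USER_TYPES) for s in STYLES
    }
    return max(expected, key=expected.get)

# After observing a few messages, the partner believes the user is probably
# a layperson but might be an expert:
posterior = {"expert": 0.3, "layperson": 0.6, "child": 0.1}
print(best_style(posterior))  # -> "plain_language"
```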

Motifs from computational cognitive science:

| Motif | Description |
| --- | --- |
| Probabilistic Mental Models and Inference | Humans update beliefs and draw inferences consistent with probabilistic generative models representing the world. |
| Structured Knowledge Representations | Humans have abstract, highly structured conceptual representations that include causality, agents, and physical representations. |
| Hierarchical Models | Humans construct and update hierarchical representations that separate concrete knowledge and beliefs from abstract ones. |
| Theory Learning as Program Synthesis | Human minds can be viewed as growing and editing theories of the world, expressed as programs, to “improve” their codebase (world models). |
| Resource Rationality | Humans make rational choices about how to allocate finite computational resources, including time and memory. |
| Goal-Directed Planning and Search | Humans are intentional actors who plan to achieve goals by reasoning about the (uncertain) effects of their (possible) actions in the environment. |
| Bayesian Theory of Mind (BToM) | Humans represent other agents as intentional, intelligent actors, and probabilistically infer their mental states from observations of their actions. |
| Rational Speech Acts (RSA) | Humans reason about language as an intentional, communicative action in order to infer a speaker’s underlying goals. |
| Learning to Learn | Humans meta-learn (improve our overarching ability to learn) jointly with learning new concrete concepts and skills. |
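Several of these motifs have compact computational statements. As one example, the Rational Speech Acts motif is standardly expressed as a recursion of a literal listener, a pragmatic speaker, and a pragmatic listener; the sketch below runs that recursion on a toy reference game. The objects, utterances, and rationality parameter `alpha` are illustrative assumptions, not details from the paper.

```python
# Minimal sketch of the standard Rational Speech Acts recursion on a toy
# reference game (illustrative lexicon and parameters).

OBJECTS = ["blue_square", "blue_circle", "green_square"]
UTTERANCES = ["blue", "green", "square", "circle"]

def literal(utterance: str, obj: str) -> bool:
    """Truth-conditional lexicon: does the word literally apply to the object?"""
    return utterance in obj

def normalize(d: dict) -> dict:
    z = sum(d.values())
    return {k: v / z for k, v in d.items()}

def L0(utterance: str) -> dict:
    """Literal listener: uniform over objects the utterance is true of."""
    return normalize({o: float(literal(utterance, o)) for o in OBJECTS})

def S1(obj: str, alpha: float = 4.0) -> dict:
    """Pragmatic speaker: prefers utterances in proportion to L0 informativeness."""
    return normalize({u: L0(u)[obj] ** alpha for u in UTTERANCES})

def L1(utterance: str) -> dict:
    """Pragmatic listener: Bayesian inversion of the pragmatic speaker."""
    return normalize({o: S1(o)[utterance] for o in OBJECTS})

# Hearing "blue", the pragmatic listener favors blue_square: a speaker who
# meant blue_circle would more likely have said "circle".
print(L1("blue"))
```

The pragmatic listener here illustrates the “I understand you” and “we understand the world” desiderata in miniature: meaning is recovered not from words alone, but by inverting a model of an intentional speaker against a shared representation of the context.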