Language Understanding and Pragmatics

Do language models segment events like human consensus does?

Can GPT-3 identify event boundaries in narrative text the way humans do? This matters because it could reveal whether language models and human cognition share similar predictive mechanisms for understanding continuous experience.

Note · 2026-02-23 · sourced from Cognitive Models Latent

Humans perceive continuous experience as discrete events — "restaurant visits" and "train rides" — with identifiable boundaries. Studying event cognition requires these boundaries to be annotated, typically crowd-sourced from large behavioral samples. GPT-3, prompted with instructions similar to those given human participants, segments continuous narrative text into events that correlate significantly with human annotations. More strikingly, GPT-3's boundaries are closer to the human consensus (averaged across annotators) than boundaries from individual human annotators.
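The comparison in this finding can be sketched as follows. This is a toy illustration, not the study's actual pipeline: the annotator vectors, the model's output, and the use of Pearson correlation as the agreement measure are all assumptions made for the example.

```python
# Toy sketch: compare a model's boundary judgments to individual
# annotators and to their consensus. All data below are invented.

def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# One binary vector per annotator: 1 = "event boundary after this sentence".
annotators = [
    [1, 0, 0, 1, 0, 1, 0, 0, 1, 0],
    [1, 0, 1, 1, 0, 0, 0, 0, 1, 0],
    [0, 0, 0, 1, 1, 1, 0, 0, 1, 0],
]
# Consensus: fraction of annotators marking each position.
consensus = [sum(col) / len(annotators) for col in zip(*annotators)]

model = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]  # hypothetical model output

r_consensus = pearson(model, consensus)
r_individual = [pearson(model, a) for a in annotators]
print(r_consensus, r_individual)
```

The "closer to consensus" claim is the statement that, on real data, `r_consensus` tends to exceed each value in `r_individual`.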

This is not just a practical finding about automating event annotation. It suggests a deeper parallel between next-token prediction and human event cognition. Event Segmentation Theory proposes that humans track ongoing events through predictive models that update at event boundaries — moments when prediction error spikes because the situation has changed. Next-token prediction in language models follows an analogous structure: the model continuously predicts what comes next, and event boundaries correspond to points of high predictive uncertainty.
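The analogy can be made concrete with a minimal sketch: treat per-token surprisal as the prediction-error signal and flag a boundary wherever it spikes above its trailing average. The surprisal values, window, and threshold below are invented; a real pipeline would use -log p(token) from an actual language model.

```python
# Sketch of the prediction-error account of event boundaries applied
# to next-token prediction: a boundary is a point where surprisal
# spikes well above its recent baseline.

def detect_boundaries(surprisals, window=3, threshold=1.5):
    """Return indices where surprisal exceeds threshold * trailing mean."""
    boundaries = []
    for i in range(window, len(surprisals)):
        trailing = sum(surprisals[i - window:i]) / window
        if surprisals[i] > threshold * trailing:
            boundaries.append(i)
    return boundaries

# Low surprisal within an event, spikes where the situation changes.
surprisals = [2.1, 1.8, 2.0, 1.9, 6.5, 2.2, 2.0, 1.7, 7.1, 2.3]
print(detect_boundaries(surprisals))  # prints [4, 8]
```

The choice of a trailing-mean baseline rather than a global threshold mirrors the theory's claim that boundaries are relative spikes in prediction error, not absolute levels of uncertainty.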

The "closer to consensus" finding has an elegant explanation: individual human annotators bring idiosyncratic biases (personal experience, attention fluctuations, interpretation differences). The consensus is obtained by averaging across annotators, canceling out individual noise. GPT-3, trained on massive text corpora, may have already averaged across the distributional regularities of many human writers' event descriptions — effectively pre-computing the consensus through training.
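The averaging argument can be demonstrated with a small simulation, under the assumption that each annotator reports the same underlying boundary signal plus independent noise (the signal values and noise model below are made up):

```python
# Toy simulation: averaging across noisy annotators cancels independent
# noise, so the consensus tracks the underlying signal better than any
# typical individual does.
import random

random.seed(0)
true_signal = [0.9, 0.1, 0.1, 0.8, 0.1, 0.9, 0.1, 0.1, 0.9, 0.1]

def noisy_annotator(signal, noise=0.4):
    """An annotator's report: the signal plus uniform noise, clipped to [0, 1]."""
    return [max(0.0, min(1.0, s + random.uniform(-noise, noise))) for s in signal]

annotators = [noisy_annotator(true_signal) for _ in range(20)]
consensus = [sum(col) / len(annotators) for col in zip(*annotators)]

def mse(xs, ys):
    return sum((x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

err_consensus = mse(consensus, true_signal)
err_individuals = sum(mse(a, true_signal) for a in annotators) / len(annotators)
assert err_consensus < err_individuals  # averaging cancels independent noise
```

On this account, training on text from many writers plays the role of the averaging step: the model's output already approximates the mean of the population rather than any single noisy member of it.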

However, this may also reflect a limitation. As with the question of why language models fail at communicative optimization, the event segmentation capability may be a statistical regularity (event boundaries correspond to distributional shifts in text) rather than genuine event understanding. A model could identify event boundaries purely from lexical and structural cues without any understanding of what events are.


