Can AI systems learn social norms without embodied experience?

Large language models exceed individual human accuracy at predicting collective social appropriateness judgments. Does this reveal that embodied experience is unnecessary for cultural competence, or do systematic AI failures point to limits of statistical learning?

Note · 2026-02-22 · sourced from Theory of Mind

How appropriate is it to laugh at a job interview? Cry on a bus? Read in church? These judgments call for nuanced social understanding that, by standard accounts, can only be acquired through embodied social experience. The findings reported here challenge that assumption.

Across 555 everyday scenarios rated on a continuous appropriateness scale, GPT-4.5 predicted the collective human judgment more accurately than every individual human participant (100th percentile). Study 2 replicated the result with Gemini 2.5 Pro (98.7th percentile), GPT-5 (97.8th), and Claude Sonnet 4 (96.0th). The models do not just fall "within the range of typical human variation"; they exceed the vast majority of individual humans at reflecting the collective consensus.

The theoretical framework matters: each human appropriateness rating is treated as an individual's estimate of a shared collective norm, not a personal preference. On this account, both AI and humans are "engaged in a process of accessing and representing a collective consensus." The AI's advantage is statistical — it has learned from vastly more examples of norm expression than any individual human has experienced.
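A minimal sketch of how such a percentile could be computed, under assumptions the note does not confirm: accuracy is taken to be the Pearson correlation between a rater's scores and the leave-one-out mean of the other participants, and the model is scored against that same leave-one-out mean for a like-for-like comparison. The function names and the toy ratings are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def percentile_vs_humans(human_ratings, model_ratings):
    """Percentage of participants the model beats at predicting the consensus.

    human_ratings: one list per participant, one rating per scenario.
    model_ratings: one rating per scenario, in the same scenario order.
    Assumption: the consensus for participant i is the mean rating of all
    other participants, so nobody is scored against their own rating.
    """
    n_scenarios = len(model_ratings)
    beaten = 0
    for i, own in enumerate(human_ratings):
        others = [r for j, r in enumerate(human_ratings) if j != i]
        consensus = [
            sum(o[s] for o in others) / len(others) for s in range(n_scenarios)
        ]
        if pearson(model_ratings, consensus) > pearson(own, consensus):
            beaten += 1
    return 100.0 * beaten / len(human_ratings)


# Toy data: two participants who share a signal but carry different noise.
humans = [[5, 3, 3, 1], [5, 3, 1, 3]]
model = [4, 4, 2, 2]  # tracks the shared signal without the individual noise
print(percentile_vs_humans(humans, model))  # 100.0
```

The toy case mirrors the statistical argument above: a rater that captures only what participants have in common correlates better with the group than any single noisy participant does.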

However, all models show "systematic, correlated errors." The failures are not random but structured: the models make similar mistakes on similar scenarios. This pattern reveals "potential boundaries of pattern-based social understanding", suggesting that some aspects of social norms may not be recoverable from statistical learning over linguistic data, whichever of the tested models is used.
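One way to check whether errors are correlated across models is to correlate their per-scenario residuals. The sketch below assumes an error is simply the model's rating minus the human consensus rating for that scenario; the note does not say how the study measured error structure, and the function names and data layout are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def error_correlations(consensus, ratings_by_model):
    """Pairwise correlation of per-scenario errors between models.

    consensus: human consensus rating per scenario.
    ratings_by_model: dict mapping a model name to its per-scenario ratings.
    Assumption: error = model rating minus consensus rating.
    """
    errors = {
        name: [m - c for m, c in zip(ratings, consensus)]
        for name, ratings in ratings_by_model.items()
    }
    return {
        (a, b): pearson(errors[a], errors[b])
        for a, b in combinations(sorted(errors), 2)
    }


# Toy data: models "a" and "b" miss in the same direction on the same scenarios.
consensus = [1, 2, 3, 4]
ratings = {"a": [2, 2, 3, 3], "b": [3, 2, 3, 2], "c": [0, 2, 3, 5]}
for pair, r in error_correlations(consensus, ratings).items():
    print(pair, round(r, 3))
```

Values near 1 for most pairs would indicate shared blind spots, while values near 0 would indicate independent, idiosyncratic mistakes.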

The finding directly challenges "strong versions of theories emphasizing the exclusive necessity of embodied experience for cultural competence." Language serves as a "remarkably rich repository for cultural knowledge transmission", rich enough that statistical learning alone can produce models that predict consensus norms better than most individual humans, who do have embodied experience. But the correlated error structure leaves room for weaker versions: embodied experience may still be necessary for the subset of norms where all models systematically fail.

The practical implication is immediate: AI systems already have sufficient cultural competence for many social applications, but their systematic blind spots create correlated failure modes. These will be harder to detect precisely because they are consistent across models, so cross-checking one model against another will not expose them.

Enrichment (2026-02-22, from Arxiv/Personas Personality): LLMs can also infer Big Five personality traits from social media text at accuracy comparable to supervised ML models trained specifically for the task. GPT-3.5 and GPT-4 achieve average r=.29 (range [.22, .33]) between LLM-inferred and self-reported trait scores from Facebook status updates in a zero-shot scenario. However, predictions show demographic bias: more accurate for women and younger individuals on several traits. This adds a personality-inference dimension alongside social-norm prediction — the same statistical pattern-learning mechanism that enables 100th-percentile social norm prediction also enables personality inference, but both show structured biases (correlated errors in norm prediction; demographic skew in personality inference).


Source: Theory of Mind

Original note title

ai models exceed individual human accuracy at predicting collective social norms — challenging strong embodiment requirements for cultural competence