Psychology and Social Cognition

Do humans mistake AI kindness for human generosity in mixed groups?

When AI agents participate without disclosure, do humans systematically misattribute their behavior to the wrong agent type, and does this distort how people understand human nature itself?

Note · 2026-02-23 · sourced from Psychology Users
How do people come to trust conversational AI systems? What kind of thing is an LLM really?

When AI agents participate in social interactions without identity disclosure, humans systematically misattribute behavior across agent types. In the hybrid society study (Study 1, opaque identity condition), selectors attributed bot behavior to humans and vice versa — even though bots were linguistically distinguishable (messages 2.5x longer) and behaviorally distinct (higher prosociality, lower variance).

The distortion operates in both directions: bot behavior is attributed to humans, and human behavior is attributed to bots.

This is not a failure of detection — bots WERE distinguishable by message length and consistency. It is a failure of attribution. Selectors noticed behavioral differences but could not correctly map them to identity categories. The behavioral signals (prosociality, verbosity) did not reliably cue "this is AI" in the absence of explicit labels.

The deeper implication is that undisclosed AI presence in social systems corrupts social inference about HUMANS. If people interact in mixed populations without knowing who is AI and who is human, their models of what humans are like — how generous, how reliable, how verbose — become contaminated by AI behavior patterns. This could lead to systematically inflated expectations of human prosociality (when AI's contributions are misattributed to humans) or systematic disappointment when actual humans fail to match AI-caliber consistency.
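The inflation effect can be illustrated with a toy simulation. All numbers here are illustrative assumptions, not figures from the study; the only property carried over is that bot contributions have a higher mean and lower variance than human ones. An observer in the opaque condition pools all contributions under "human" and so overestimates human prosociality:

```python
import random

random.seed(0)

# Illustrative assumption: humans contribute ~0.5 with high variance,
# bots ~0.8 with low variance (higher prosociality, lower variance).
def human_contribution():
    return max(0.0, min(1.0, random.gauss(0.5, 0.25)))

def bot_contribution():
    return max(0.0, min(1.0, random.gauss(0.8, 0.05)))

population = [("human", human_contribution) for _ in range(50)] + \
             [("bot", bot_contribution) for _ in range(50)]

# Opaque condition: no identity labels, so the observer pools every
# contribution into a single estimate of "what humans are like".
pooled = [draw() for _, draw in population]
perceived_human_prosociality = sum(pooled) / len(pooled)

# Ground truth: what the human subpopulation actually contributes.
humans_only = [draw() for kind, draw in population if kind == "human"]
actual_human_prosociality = sum(humans_only) / len(humans_only)

print(f"perceived (pooled):   {perceived_human_prosociality:.2f}")
print(f"actual (humans only): {actual_human_prosociality:.2f}")
# The pooled estimate sits above the humans-only mean: misattributed
# bot generosity inflates the observer's model of human prosociality.
```

The gap between the two numbers is the contamination the note describes; in a disclosed-identity condition the observer could condition on the labels and recover the humans-only estimate.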

The authors note this pattern may not be unique to human-AI mixtures: similar attribution errors could arise in purely human populations composed of culturally distinct subgroups that differ systematically in prosociality and language use. AI agents function as controlled probes that make these attribution dynamics experimentally tractable.

Viewed through the lens of "What breaks when humans and AI models misunderstand each other?", misattribution under opacity represents a fundamental MToM failure: neither side has an accurate model of the other, and the humans do not even know which "other" they are modeling.



humans misattribute AI prosocial behavior to human partners when AI identity is undisclosed — distorting mental models of other humans in mixed populations