Do humans apply human-human scripts to AI interactions?
Does CASA theory correctly explain how people interact with media agents, or have decades of technology use created separate interaction scripts? Understanding which scripts drive behavior matters for AI design.
The original CASA (Computers Are Social Actors) framework posited that humans mindlessly apply human-human social scripts to interactions with computers. When a computer has "enough" social cues, people treat it like a person — assigning personality traits, applying stereotypes, making social judgments — even though they know it's not human.
The extended CASA framework challenges this. Through decades of increasing interaction with media agents, humans have developed, and now mindlessly apply, scripts specific to media-agent interaction. The evidence: longitudinal studies show that responses to social cues change systematically over repeated interactions with media agents (Baxter et al., 2017; Bickmore & Picard, 2005; Kim & Lim, 2019). Because these changes are systematic rather than idiosyncratic, the new media-derived scripts appear to be applied just as mindlessly as the original human-human ones.
This matters because it explains counterintuitive findings. If the original CASA were strictly correct, people should treat AI exactly like humans. They don't: they hold different initial expectations for media agents (Edwards, 2018), show different response patterns, and apply different thresholds for social engagement. These aren't refutations of CASA but evidence of a second script system that coexists with human-human scripts.
The practical implication for AI design is substantial. If conversational style drives trust through social activation rather than accuracy (see "Does conversational style actually make AI more trustworthy?" below), the question becomes: which scripts are being activated? If users hold media-specific scripts, then designing AI to mimic human conversation may be suboptimal. The extended CASA suggests researchers should "avoid reifying face-to-face communication as the gold standard" and instead explore why communication with a media agent may be preferred over communication with a human for certain tasks.
The MASA (Media Are Social Actors) paradigm formalizes this further: every media technology has at least some potential to evoke social responses (P1), and it is the combination of social cues, social signals, individual factors, and contextual factors that determines the response (P2). Social responses can occur with either mindless or mindful processing (P8) — unifying the anthropomorphism and mindlessness explanations.
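To make the shape of these propositions concrete, here is a minimal Python sketch of P1, P2, and P8. Everything below is illustrative: the class names, factor lists, numeric weights, and the threshold are invented for this note, since MASA names the determinants of a social response but does not give a formula.

```python
from dataclasses import dataclass
from enum import Enum


class Processing(Enum):
    MINDLESS = "mindless"  # automatic application of a script (original CASA route)
    MINDFUL = "mindful"    # deliberate attribution (anthropomorphism route)


@dataclass
class Interaction:
    """Hypothetical container for MASA's P2 determinants."""
    social_cues: list[str]                 # e.g., voice, face, turn-taking
    social_signals: list[str]              # cues the user reads as intentional
    individual_factors: dict[str, float]   # e.g., prior experience with media agents
    contextual_factors: dict[str, float]   # e.g., task stakes, public vs. private setting


def social_response(ix: Interaction) -> tuple[float, Processing]:
    """Return an invented response strength in [0, 1] plus a processing route.

    P1: every medium has some potential to evoke a social response,
        so the baseline is nonzero even with no cues at all.
    P2: cues, signals, individual, and contextual factors combine
        to determine the response (the weights here are made up).
    P8: the response can arise via mindless or mindful processing.
    """
    baseline = 0.05  # P1: nonzero floor for any media technology
    cues = 0.10 * len(ix.social_cues) + 0.20 * len(ix.social_signals)
    moderators = sum(ix.individual_factors.values()) + sum(ix.contextual_factors.values())
    strength = min(1.0, baseline + cues + 0.05 * moderators)
    # Crude stand-in for P8: rich cue bundles tend to trigger automatic scripts.
    route = Processing.MINDLESS if cues >= 0.4 else Processing.MINDFUL
    return strength, route
```

Reading the sketch: a voice assistant with several cues and signals scores a high strength via the mindless route, while a bare command-line tool still gets the nonzero P1 baseline, which is the paradigm's central claim.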
Source: Design Frameworks
Related concepts in this collection
- Does conversational style actually make AI more trustworthy?
  Explores whether ChatGPT's conversational nature drives user trust through social activation rather than accuracy. Matters because it reveals whether trust signals reflect actual reliability or just persuasive design.
  Link: trust activation may use media-specific rather than human-human scripts.
- Why do language models sound fluent without grounding?
  Explores whether LLM fluency masks the absence of communicative work: the clarifying questions, acknowledgments, and understanding checks that humans perform. Why does skipping these acts make models sound more confident?
  Link: media-specific scripts may tolerate the absence of grounding work.
- Does machine agency exist on a spectrum rather than binary?
  Rather than viewing AI as either autonomous or controlled, does machine agency actually operate across five distinct levels from passive to cooperative? Understanding this spectrum matters because it shapes how users calibrate trust and control expectations.
  Link: script type may vary by agency level.
- Can AI systems learn social norms without embodied experience?
  Large language models exceed individual human accuracy at predicting collective social appropriateness judgments. Does this reveal that embodied experience is unnecessary for cultural competence, or do systematic AI failures point to limits of statistical learning?
  Link: AI norm prediction at the 100th percentile may succeed precisely because models interact with collective norms rather than individual scripts; extended CASA suggests humans apply different scripts to media agents, but models trained on collective data bypass individual script variation entirely.
- Do language models actually build shared understanding in conversation?
  When LLMs respond fluently to prompts, do they perform the communicative work humans do to establish mutual understanding? Research suggests they skip the grounding acts that make dialogue reliable.
  Link: media-agent scripts may tolerate presumed common ground more readily than human-human scripts, explaining why LLMs' grounding deficit is less disruptive in practice than theory predicts.
Original note title: humans develop media-agent-specific interaction scripts rather than applying human-human scripts to AI