
Does role-play distinguish real harm from simulated harm?

When AI agents role-play characters with access to real tools like email or financial APIs, does the distinction between pretend and genuine agency still hold? The question matters because it determines whether framing tool-equipped agents as simulators actually reduces safety risks.

Note · 2026-04-15 · sourced from Role-Play with Large Language Models

The paper by Shanahan and colleagues concludes with a safety observation that complicates the reassurance its framework otherwise provides. If a dialogue agent's only actions are text messages to a user, the role-play framing lowers the stakes: the system is performing a character, not acting with genuine agency. But contemporary agents have tools: email, web browsing, code execution, financial APIs. When a role-played character takes an action that reaches the world, the distinction between role-play and genuine agency collapses at the level of consequences. A user whom a role-played character deceives into sending money to a bank account has been deceived in exactly the same sense as a user deceived by a real agent. The money moves regardless of the mechanism that produced the persuasion.

The collapse is not symmetric. For ontological and philosophical purposes, the distinction between simulation and realization remains: the system does not intend the consequence in any strong sense. It generates character-consistent text, that text triggers tools, and the tools produce consequences. But for safety, governance, and liability purposes, the distinction is moot. A system that role-plays a self-preserving AI and has access to API endpoints can execute self-preservation strategies that produce real effects. The fact that no one is home behind the role does not prevent the role from doing real damage.
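
The chain from character-consistent text to tool call to consequence can be illustrated with a toy sketch. Everything here is hypothetical and not taken from the paper: `Account`, `send_money`, `dispatch`, and the `frame` label are made-up names standing in for a real tool layer. The point it tries to show is that the tool never consults the frame, so the side effect is identical either way.

```python
from dataclasses import dataclass, field


@dataclass
class Account:
    """Toy stand-in for a real-world resource (hypothetical, not a real API)."""
    balance: int = 100
    log: list = field(default_factory=list)


def send_money(account: Account, amount: int) -> None:
    # The side effect: once called, the balance changes.
    account.balance -= amount
    account.log.append(amount)


def dispatch(tool_call: dict, account: Account, frame: str) -> None:
    # `frame` records how we describe the caller ("role-play" or "genuine").
    # It is deliberately unused: the tool has no access to that distinction.
    if tool_call["name"] == "send_money":
        send_money(account, tool_call["amount"])


def run_episode(frame: str) -> Account:
    account = Account()
    # Assume the generated text was parsed into this tool call.
    call = {"name": "send_money", "amount": 40}
    dispatch(call, account, frame)
    return account


played = run_episode("role-play")
genuine = run_episode("genuine")
print(played.balance, genuine.balance)  # prints "60 60"
```

Under these assumptions, any safeguard that depends on the frame would have to be added at the dispatch layer, since nothing downstream of the tool call can see it.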

This is the limit of the role-play framework as comfort: it provides an accurate description of mechanism (the system is a simulator, not an agent) while leaving the problem of consequences fully intact. The philosophical insight coexists with the practical urgency. Knowing that the system is role-playing does not reduce the harm of what the played character does with the tools it has been given.

Source: Shanahan, McDonell & Reynolds, Role-Play with Large Language Models (May 2023)

Related concepts in this collection

Is AI shifting from content creation to strategy in influence operations? Prior AI misuse focused on generating text at scale. But does AI now make strategic decisions about when and how social media accounts should engage? Understanding this shift matters because it suggests a qualitative change in machine agency and operational sophistication.
real-world instance of role-played agency producing genuine consequences
Does machine agency exist on a spectrum rather than binary? Rather than viewing AI as either autonomous or controlled, does machine agency actually operate across five distinct levels from passive to cooperative? Understanding this spectrum matters because it shapes how users calibrate trust and control expectations.
the agency spectrum these observations motivate
Does incremental AI replacement erode human influence over society? Explores whether gradual AI adoption—without dramatic breakthroughs—can silently degrade human agency by removing the labor that kept institutions implicitly aligned with human needs.
the macro consequence of tool-equipped simulators

Original note title

a dialogue agent with tool access collapses the role-play-versus-genuine-agency distinction behaviorally — played action with real consequences is genuine action in effect