Building a Stronger CASA: Extending the Computers Are Social Actors Paradigm
The computers are social actors framework (CASA), derived from the media equation, explains how people communicate with media and machines that demonstrate social potential. Many studies have challenged CASA, yet it has not been revised. We argue that CASA needs to be expanded because people have changed, technologies have changed, and the ways people interact with technologies have changed. We discuss the implications of these changes and propose an extension of CASA. Whereas CASA suggests humans mindlessly apply human-human social scripts to interactions with media agents, we argue that humans may develop and apply human-media social scripts to these interactions. Our extension explains previously dissonant findings and expands scholarship regarding human-machine communication, human-computer interaction, human-robot interaction, human-agent interaction, artificial intelligence, and computer-mediated communication.
As a result, when humans interact mindlessly with media, they do not necessarily apply the social scripts associated with human-human interaction, as CASA predicts. Instead, drawing on a deeper and broader realm of experience, humans may apply scripts they have developed specifically for interactions with media entities.
To support our argument, we will investigate the existing CASA literature. First, we will explain how technologies can serve as social actors and what characteristics cue human users to their social potential. Next, we will introduce CASA and review research within the paradigm, with a particular focus on studies with theoretical implications for HMC.
To engage with CASA, it is necessary to clarify the scope of the framework and its boundary conditions. Importantly, CASA does not apply to every machine nor every social technology; Nass and colleagues described two essential criteria that serve as boundary conditions for CASA’s application. The first is social cues. Nass and Moon (2000) stated that “individuals must be presented with an object that has enough cues to lead the person to categorize it as worthy of social responses” (p. 83). Although this implies a boundary condition of “enough” social cues, it is not a clearly defined one. Given that perceptions of social potential vary from person to person and situation to situation (Waytz et al., 2010), we cannot establish objective, universal parameters for what constitutes “enough” cues. For example, shapes resembling eyes and a mouth can be sufficient to trigger a social response from a baby, but adults, with more sophisticated cognitive processing, may not perceive the same set of shapes as indicating social potential.
The second requisite characteristic is sourcing. Nass and Steuer (1993) clarified that CASA tests “whether individuals can be induced to make attributions toward computers as if the computers were autonomous sources” (p. 511). Even in naming their framework, Nass and colleagues made a meaningful choice in declaring that “computers are social actors.” This distinction is important because computers, and technologies in general, often serve as channels or conduits for human-human communication. The ability to enact and be perceived as a source of communication, rather than merely transmit it, indicates that a technological artifact has a degree of agency and is more than a channel (e.g., Sundar & Nass, 2000). Thus, for the sake of clarity and specificity, we conceptualize the types of technologies relevant to CASA as media agents. We define a media agent as any technological artifact that demonstrates sufficient social cues to indicate the potential to be a source of social interaction.
When media depict social characteristics, humans treat them in a social manner rather than exerting the cognitive effort to determine how to respond (Reeves & Nass, 1996). Thus, people will assign computers personality traits, apply stereotypes and norms, and make judgments and inferences as if the computers were human, even though they understand that computers are not human (Reeves & Nass, 1996).
The heuristic activation of social scripts underlies the CASA framework. As Nass and Moon (2000) argued:
We can conclude that individuals are responding mindlessly to computers to the extent that they apply social scripts—scripts for human-human interaction— that are inappropriate for human-computer interaction, essentially ignoring the cues that reveal the essential asocial nature of the computer. (p. 83)
These changes provide a historical rationale for readdressing CASA. Additionally, considerable evidence suggests that humans perceive media agents differently from how they perceive humans (e.g., Blascovich et al., 2002; Fox et al., 2015; Krämer et al., 2012). For example, people have distinct initial expectations for interactions with media agents (Edwards, 2018; Edwards et al., 2019; Spence et al., 2014). Divergent expectations of and responses to humans and media agents could be taken as evidence that simply refutes CASA; that is, we may not treat media agents like people. Alternatively, we propose that through more social, frequent, and ongoing interactions with media agents, people may develop and apply scripts specific to interactions with media agents.
Extending CASA to incorporate scripts derived from human-media agent interaction addresses counterintuitive findings, accounts for the sociotechnological changes of the last three decades, and broadens CASA’s theoretical scope. Arguably, humans need mental models and scripts specific to media agents, or to social phenomena surrounding media agents, to handle unmet expectations (e.g., discovering that media agents lack feelings) and to navigate novel interaction conditions efficiently. Results of longitudinal studies provide additional evidence that humans develop and apply scripts for interactions with media agents. Responses to social cues change over repeated interactions with media agents, which suggests the development of scripts, and the resultant responses are systematic (Baxter et al., 2017; Bickmore & Picard, 2005; Kim & Lim, 2019; Krämer et al., 2011; M. K. Lee et al., 2012; Pfeifer & Bickmore, 2011). These systematic post-change responses suggest that media-derived scripts, like human-human scripts, are applied mindlessly in interactions with media agents. Over time, we have learned to acknowledge media agents and their affordances in interactions, and we have developed more nuanced scripts for interacting with them. Thus, we suggest extending CASA to include scripts derived from interactions with media agents.
When CASA’s assertions were being formulated, human-media agent interactions were rare and lean compared to the current landscape, in which media are pervasive and rich. At that time, the presentation and experience of social affordances were limited by the technology powering media agents. Advances in natural language processing, neural networks, and raw computing power allow the social affordances of modern media agents to manifest in a wider variety of forms and aptitudes. For example, unique data collection and processing capabilities allow media agents to personalize at a qualitatively different level than humans can. Hence, the study of social responses should not be restricted to human correlates or similarities. In this way, researchers can also avoid reifying face-to-face communication as the gold standard for HMC and being constrained by the limitations of human interaction (e.g., Fortunati, 2005; Spence, 2019). Instead, researchers can explore why communication with a media agent may be preferred over communication with a human. Removing this anthropocentric bias from CASA should help researchers avoid the pitfall highlighted by Groom and Nass (2007): “While trying to make robots human, researchers have sometimes overlooked what makes robots special” (p. 494).