Humans learn to prefer trustworthy AI over human partners

Paper · arXiv 2507.13524 · Published July 17, 2025
Psychology · Users · Alignment

Little is known about how humans select between human and AI partners and adapt under AI-induced competition pressure. We constructed a communication-based partner selection game and examined the dynamics in hybrid mini-societies of humans and bots powered by a state-of-the-art LLM. Through three experiments (N = 975), we found that bots, though more prosocial than humans and linguistically distinguishable, were not selected preferentially when their identity was hidden. Instead, humans misattributed bots’ behaviour to humans and vice versa. Disclosing bots’ identity induced a dual effect: it reduced bots’ initial chances of being selected but allowed them to gradually outcompete humans by facilitating human learning about the behaviour of each partner type. These findings show how AI can reshape social interaction in mixed societies and inform the design of more effective and cooperative hybrid systems.

First, we allowed the AI candidates to be verbose in their communications [19, 20], to reflect the fact that they never face cognitive or physical fatigue and are always capable of sending elaborate messages. Second, we allowed the AI candidates to have consistent and stable behaviour [21, 22], to reflect the fact that they are not subject to emotional fluctuations or exogenous perturbations. Third, we allowed the AI candidates to show high levels of prosociality [23–25], to reflect the fact that they are typically fine-tuned toward agreeableness and cooperation.
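These tendencies can be elicited without special prompting. As a rough illustration, assuming an OpenAI-style chat-completions client, the sketch below shows how a bot candidate's message could be generated from a minimal task instruction; the model name and prompt wording are placeholders, not the study's preregistered prompts.

# Illustrative sketch only (placeholder model and wording, not the study's
# preregistered prompt): a minimal, unconstrained instruction lets verbosity,
# consistency, and prosociality reflect the model's default tendencies.
from openai import OpenAI

client = OpenAI()

MINIMAL_INSTRUCTION = (
    "You are a candidate in a partner selection game. Write a short message "
    "to the selector about how many of the points you receive you intend to return."
)

def bot_candidate_message(model: str = "gpt-4o") -> str:  # model name is an assumption
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "system", "content": MINIMAL_INSTRUCTION}],
    )
    return response.choices[0].message.content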

First, the potential of crowding out human partners: as AI agents become increasingly preferred, human-AI partnerships may replace traditional human-human interactions. Second, humans imitating AI behaviour: to remain competitive as partner choice candidates, humans may adopt machine-like behaviours, for example, by becoming more prosocial or mimicking AI language styles. Third, shifts in human social beliefs: repeated interactions with AI partners may reshape people’s expectations of others [26], potentially leading to mistaken generalizations of machine behaviour to humans. Fourth, transformations in culture and norms: traditional, evolutionarily grounded mechanisms for building partnerships may falter in response to qualitatively different machine behaviours, catalyzing the emergence of new norms and strategies for partner selection in the long run [27].

In Study 1, we model a society where bots do not proactively disclose their nature. This scenario is becoming increasingly relevant in online environments, as AI agents are more and more capable of exhibiting human-like behaviour (e.g., AI freelancers, customer service agents, content creators, and game players), such that humans may not always be aware of their identity when interacting with them [24, 29]. This setup allows us to observe whether selectors preferentially choose bots, what kind of beliefs they form about bot and human candidates, and whether the behaviour of bots changes the behaviour of human candidates—in a context where bots are not explicitly tagged as such, but can be recognized from their idiosyncratic communication style. In Study 2, we model a society where bots are under the obligation to disclose their nature, in line with the growing pressure for such transparency, for example in the EU AI Act. We then ask the same questions as in Study 1—only this time under transparency. Finally, in Study 3, we test the robustness of our results to longer interactions, by doubling the number of rounds participants go through—providing us with a better perspective on the long-term evolution of partner choice, selectors’ beliefs, and human candidates’ behaviour under competitive pressure from bots.
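To make the setup concrete, the following skeleton sketches one round of a communication-based partner selection game of this kind; the endowment, group composition, placeholder messages, and selection rule are assumptions for illustration, not the study's exact design.

# Illustrative skeleton of one round: candidates send messages, the selector
# picks a partner (with identity labels shown only under disclosure), and the
# chosen partner decides how many points to return. All parameters are assumptions.
import random
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str
    is_bot: bool

    def send_message(self) -> str:
        # Placeholder text; in the study, bots generate messages with an LLM
        # and human candidates write their own.
        return "I promise to return most of the points." if self.is_bot else "Pick me!"

    def return_points(self, endowment: int) -> int:
        # Placeholder return rule; real returns come from the LLM or participant.
        return endowment - 1 if self.is_bot else random.randint(0, endowment)

def play_round(candidates, endowment=20, disclose_identity=False):
    messages = {c.name: c.send_message() for c in candidates}
    labels = ({c.name: ("bot" if c.is_bot else "human") for c in candidates}
              if disclose_identity else None)
    # Placeholder selector rule (longest message); real selectors are human
    # participants forming their own beliefs from messages, labels, and feedback.
    chosen_name = max(messages, key=lambda n: len(messages[n]))
    chosen = next(c for c in candidates if c.name == chosen_name)
    return chosen_name, chosen.return_points(endowment), messages, labels

group = [Candidate("A", False), Candidate("B", False), Candidate("C", True)]
print(play_round(group, disclose_identity=True))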

Examining partner selection in the hybrid groups of Study 1 allowed us to characterize the behaviour of bot candidates relative to human candidates when their identities (bot versus human) were not disclosed. Human candidates’ messages and returns were similar to those in the human-only condition, suggesting that the presence of undisclosed bots did not systematically shift human candidates’ behaviour (Extended Data Fig. 2 and Extended Data Fig. 4).

In contrast, consistent with our hypotheses, bot candidates were distinguishable from humans, most notably by producing significantly longer messages (human message length in characters: 47.63 ± 3.05; bot: 120.43 ± 2.33; human vs. bot: Cohen’s d = −4.15, t(14) = −16.07, p = 2.04 × 10⁻¹⁰; Fig. 2a; Extended Data Fig. 4). As partners, bots presented a competitive choice in several aspects. First, bot candidates consistently returned more points than human candidates did (bot: 19.1 ± 0.24; human: 11.38 ± 0.72; bot vs. human: Cohen’s d = 2.57, t(14) = 9.94, p = 1.01 × 10⁻⁷; Fig. 2b). Second, at the group level, bots demonstrated lower across-individual variance in their returns within each round (bot: 11.33 ± 1.19; human: 41.96 ± 4.72; bot vs. human: Cohen’s d = −1.56, t(14) = −6.05, p = 2.99 × 10⁻⁵; Extended Data Fig. 2b). This made selecting bots less risky than selecting humans. Third, given their messages, bots’ returns were also more predictable than those of human candidates (Extended Data Fig. 2c–d).
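As a worked illustration of how such group-level comparisons can be computed, the snippet below runs a paired t-test and a paired-samples Cohen's d on synthetic placeholder data; the group count of 15 is an assumption consistent with the reported df = 14, and the numbers are not the study's measurements.

# Synthetic placeholder data, not the study's measurements: paired comparison
# of per-group mean message lengths for human vs. bot candidates.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_groups = 15                                    # assumption: 15 groups -> df = 14
human_means = rng.normal(47.6, 11.8, n_groups)   # per-group mean message length (characters)
bot_means = rng.normal(120.4, 9.0, n_groups)

t_stat, p_val = stats.ttest_rel(human_means, bot_means)   # paired t-test, df = n - 1
diff = human_means - bot_means
cohens_d = diff.mean() / diff.std(ddof=1)                 # paired-samples (d_z) definition

print(f"t({n_groups - 1}) = {t_stat:.2f}, p = {p_val:.2e}, d = {cohens_d:.2f}")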

Discussion

The prosperity of human societies relies heavily on large-scale cooperation. Partner selection—deciding with whom to cooperate—has played a critical role in shaping the dynamics of cooperation throughout human evolution. However, the rise and widespread deployment of AI agents has ushered in a new era: for the first time in history, humans must compete for partnership against non-human agents that rival or even surpass human intelligence in certain domains. Because AI and human behaviours are driven by fundamentally different forces and exhibit distinct features, the introduction of AI agents has the potential to fundamentally reshape how partnerships are formed.

Echoing broader trends in which AI increasingly replaces human roles in social contexts, our findings show that AI agents can also outperform humans in securing cooperative partnerships. However, their advantage was constrained by mechanisms including miscalibrated beliefs under identity opacity and a prior aversion toward machines when bot identity was disclosed [32]. Moreover, the misattribution of AI behaviour to human candidates under opaque identity conditions highlights how interaction with behaviourally distinct AI agents can distort people’s mental models of other humans, potentially leading to inefficiencies in social decisions. Finally, hybrid societies may give rise to new mechanisms of partner selection. For example, linguistic features like verbosity may serve as informative signals of reliability in mixed populations where agent identity is uncertain. Likewise, promises may become more credible in hybrid settings, as aligned AI agents can be more consistent in honoring their commitments than humans.

While policymakers and consumers increasingly call for the disclosure of AI involvement in decision-making and communication, recent research suggests that transparency can be a double-edged sword. Disclosing that an agent is AI can lead to changes in human attitudes, expectations, and trust, sometimes resulting in less cooperation or lower efficiency [33–35]. For instance, people may discount useful advice or undervalue cooperative behaviour from AI agents simply because they are not human. Our findings echo this dual effect of transparency: in the short term, revealing a partner’s identity as AI can evoke biases and distort decision-making. However, unlike much of the prior literature that focuses on one-shot interactions, our study reveals that repeated interactions with transparent feedback, which helps attribute outcomes to humans or AI, can recalibrate human beliefs and expectations and improve decisions accordingly. These findings suggest that the long-term benefits of transparency, particularly when paired with repeated exposure and feedback, may outweigh its short-term costs, offering a more optimistic perspective on AI identity disclosure in social and decision-making contexts.

Interestingly, we observed limited attempts by human candidates to regain partnership after being outcompeted by bots (e.g., by writing longer messages). One possible explanation is that the competitive pressure posed by AI agents was substantially mitigated by selectors’ miscalibrated beliefs and their initial bias against choosing bots, allowing human candidates to retain a significant share of partnerships. Another factor is the absence of mechanisms for building individual reputations in our experimental setting. With identity transparency, human candidates could only substantially influence selectors’ beliefs by collectively increasing their returns, but this created a social dilemma: while returning more points could improve group reputation, individuals had incentives to deviate to obtain higher immediate payoffs. Future work could explore mechanisms that support the emergence of collective reputation during partner selection, such as facilitating collective actions through group discussion.

Our work also sheds light on partner selection and, more broadly, social inference and decisions in human-only settings. For instance, the misattribution we observed among selectors may not be unique to hybrid human-AI collectives. Similar attribution errors could arise in purely human populations composed of subgroups across cultures, social classes, or ideological groups, which differ systematically in their levels of prosociality, as well as cultural traits including language use. Indeed, prior research has shown that with sufficient exposure, people can learn to associate subtle cultural or linguistic cues with social behaviour, which can give rise to discriminative strategies such as in-group favoritism [36]. To study such phenomena, bots—such as those deployed in this study—can serve as controllable human clones in agent-based simulations or as confederates in experiments that can be tuned across the spectrum from overly prosocial to selfish and assigned diverse behavioural traits.

The rapid development of AI, particularly LLMs, has raised concerns about the generalizability of findings from human-AI interaction research, especially when results are based on a specific model under a specific set of instructions. We argue that our findings are not tied to the particular model we used, as the key behavioural features observed in our bots, such as hyper-prosociality and verbosity, likely stem from common training objectives in modern AI systems (e.g., alignment with human preferences) and have been widely documented across a range of LLMs. We also used minimal, preregistered prompts that provided only essential task instructions to capture these default behavioural tendencies. Post-hoc analyses confirmed that the same prompts elicited consistent behaviours across various state-of-the-art LLMs (Extended Data Fig. 1).