Language Understanding and Pragmatics · Psychology and Social Cognition

Are language models actually more persuasive than humans?

Does the research evidence support claims that LLMs persuade more effectively than humans, or have we been cherry-picking studies to fit a narrative?

Note · 2026-05-02 · sourced from Argumentation

The Bilstein 2025 meta-analysis is the corrective to a literature that had been read selectively in both directions. Pooling 7 studies covering 17,422 participants, the analysis yields a random-effects estimate of Hedges' g = 0.02 (p = .53, 95% CI [-0.048, 0.093]): no detectable average difference between LLM and human persuasiveness. Egger's test flagged potential small-study effects, but trim-and-fill imputed no missing studies, so publication bias is unlikely to be hiding a real effect.
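The pooled numbers above come from a standard random-effects model. A minimal sketch of DerSimonian-Laird pooling, with the I² heterogeneity statistic computed alongside it (the effect sizes and variances below are hypothetical illustrations, not the Bilstein study data):

```python
import math

def dersimonian_laird(effects, variances):
    """Pool per-study effect sizes with a DerSimonian-Laird random-effects model.

    Returns the pooled estimate, its 95% CI, and the I^2 heterogeneity statistic.
    """
    k = len(effects)
    w = [1.0 / v for v in variances]                       # fixed-effect weights
    y_fe = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    # Cochran's Q: weighted squared deviations from the fixed-effect mean
    q = sum(wi * (yi - y_fe) ** 2 for wi, yi in zip(w, effects))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)                     # between-study variance
    w_re = [1.0 / (v + tau2) for v in variances]           # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    i2 = max(0.0, (q - (k - 1)) / q) * 100 if q > 0 else 0.0
    return pooled, (pooled - 1.96 * se, pooled + 1.96 * se), i2

# Hypothetical per-study Hedges' g values and variances -- NOT the Bilstein data.
# Studies point in different directions, so the pooled mean lands near zero
# while heterogeneity stays high.
g = [0.35, -0.20, 0.05, 0.40, -0.30, 0.10, -0.15]
v = [0.010, 0.015, 0.008, 0.020, 0.012, 0.009, 0.011]
pooled, ci, i2 = dersimonian_laird(g, v)
```

With inputs like these, the pooled estimate sits near zero with a CI spanning zero, yet I² is large: exactly the null-average, high-heterogeneity pattern the meta-analysis reports.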

Both popular framings lose their grip here. The AI-superpersuader alarm — that LLMs are systematically more persuasive than humans and therefore an emerging civic risk on that basis — is not supported by the pooled evidence. The dismissive counter — that LLMs are "just text" and therefore not particularly persuasive — is also not supported. Both stories pick studies. The pooled signal is parity.

The interesting number, though, is the heterogeneity: I² = 75.97%. More than three-quarters of between-study variance is real, not sampling noise. Persuasive effectiveness is conditional, not categorical. The right question is not whether LLMs are more persuasive on average, but under which conditions a particular LLM, in a particular conversational design, in a particular domain, outperforms or underperforms human comparators.
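To make that conditionality concrete: combining the reported pooled estimate and I² with an assumed typical within-study variance (`v_typical` below is an illustration, not a reported figure), a rough 95% prediction interval for the effect in a *new* study or deployment context spans well into both positive and negative territory:

```python
import math

# Reported numbers from the meta-analysis (Bilstein 2025)
g_pooled = 0.02
se_pooled = (0.093 - (-0.048)) / (2 * 1.96)   # back out SE from the 95% CI
i2 = 0.7597                                    # I^2 as a proportion

# ASSUMPTION: a plausible typical within-study variance; not a reported figure.
v_typical = 0.01

# Higgins' relation I^2 = tau^2 / (tau^2 + v_typical) recovers the
# between-study variance tau^2 implied by that assumption.
tau2 = i2 * v_typical / (1 - i2)

# Approximate 95% prediction interval for the effect in a new study/context
half = 1.96 * math.sqrt(tau2 + se_pooled ** 2)
lo, hi = g_pooled - half, g_pooled + half
```

Under this assumption the interval runs from clearly negative to clearly positive: a null average is fully compatible with particular model-design-domain combinations where the LLM substantially outperforms, or underperforms, human persuaders.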

This reframes "Where does AI's persuasive power actually come from?" The Levers paper documents which knobs modulate persuasiveness; Bilstein clarifies that those knobs operate against a baseline that is, on average, parity rather than superiority. The post-training intervention is not "amplify a pre-existing advantage"; it is "create or destroy advantage on a study-by-study basis."

It also reframes "Does RLHF training make models more convincing or more correct?": the sophistry effect is real but does not produce a uniform persuasion uplift across deployment contexts. It is local, conditional, and design-dependent.

For writing about AI persuasion, the headline shift is this: persuasion lives in the embedding context (model × design × domain), not in the speaker's category.


Source: Argumentation · Paper: "A meta-analysis of the persuasive power of large language models"
