Which linguistic features predict persuasion once reader ideology is statistically controlled?
This explores what actually predicts persuasion once you account for who the reader already is — the finding that most 'persuasive language' effects shrink or shift when you statistically control for the audience's political and religious ideology.
This explores what's left of "persuasive language" once you strip out the audience. The short answer the corpus gives is unsettling: a lot less than the literature claims. When debate corpora are analyzed naively, all sorts of linguistic features look predictive, but the single strongest predictor of who wins isn't anything the speaker said. It's what the reader already believed (see "Does what readers believe matter more than what debaters say?"). Political and religious ideology labels of the voters outpredict language features, and because audiences self-sort toward topics and arguments that already match them, language effects measured without reader controls are confounded with audience composition. Much of what passes for a "persuasion effect" is really audience-text matching wearing a linguistic costume.
The sharper result is that the list of predictive features doesn't just shrink when you add ideology controls; it *changes identity* (see "Do linguistic features of persuasion stay the same across audiences?"). Features that ranked as top persuaders in standard analyses fall away, and a different set survives. This is the crux of your question: it implies many published "these words persuade" findings may be artifacts of who showed up to read, not properties of the words. So the honest framing is less "here are the magic features" and more "most candidate features were confounds; treat any uncontrolled result with suspicion."
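To make the confounding argument concrete, here is a minimal simulation sketch. It is not the analysis from the debate-corpus work: the probabilities, the single binary "feature", and the function names (`simulate`, `naive_gap`, `controlled_gap`) are all illustrative assumptions. The simulated feature has no effect on persuasion at all; it simply appears more often in front of ideologically matched readers, who are also easier to persuade.

```python
import random

def simulate(n=20000, seed=0):
    """Return (matched, feature, persuaded) rows from a toy confounded process."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        # Assumption: half the readers already share the speaker's ideology.
        matched = rng.random() < 0.5
        # The feature tracks the audience (self-sorting), not persuasion.
        feature = rng.random() < (0.8 if matched else 0.2)
        # Persuasion depends ONLY on prior ideology, never on the feature.
        persuaded = rng.random() < (0.7 if matched else 0.3)
        rows.append((matched, feature, persuaded))
    return rows

def rate(rows):
    """Share of rows in which the reader was persuaded."""
    return sum(p for _, _, p in rows) / len(rows)

def naive_gap(rows):
    """Persuasion rate with the feature minus the rate without it."""
    with_feature = [r for r in rows if r[1]]
    without_feature = [r for r in rows if not r[1]]
    return rate(with_feature) - rate(without_feature)

def controlled_gap(rows):
    """Same gap computed within each ideology stratum, then size-weighted."""
    total = 0.0
    for level in (True, False):
        stratum = [r for r in rows if r[0] == level]
        total += naive_gap(stratum) * len(stratum) / len(rows)
    return total

rows = simulate()
print(f"naive gap:      {naive_gap(rows):+.3f}")
print(f"controlled gap: {controlled_gap(rows):+.3f}")
```

Under these assumed probabilities the naive gap is large and the stratified gap sits near zero, which is the pattern the text describes: the feature looked persuasive only because of who was reading. Stratifying on one binary label is the simplest possible control; a real analysis would more likely add ideology as covariates in a regression.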
What does seem to survive contact with controls, or at least operates through a mechanism that isn't about audience matching, points in two directions. One is *how a claim is grammatically packaged*: presuppositions persuade more than direct assertions, especially for information the audience hasn't heard before, because presupposing something smuggles it in as already-accepted background and bypasses the reader's evaluative scrutiny (see "Why are presuppositions more persuasive than direct assertions?"). That's a structural feature of language, not a topic-affinity effect. The other is *expressed conviction*: linguistically loaded confidence correlates with persuasive success independent of whether the claim is true, which makes it a register rather than a content property (see "Does linguistic conviction explain why LLMs persuade more effectively?").
The machine-persuasion work sharpens why this matters. LLMs reliably out-assert humans on conviction and lean on logical and quantitative framing in nearly every exchange (see "Do LLMs persuade users more often than humans do?"), and that assertive register, installed by RLHF, functions as a content-independent amplifier (see "Does linguistic conviction explain why LLMs persuade more effectively?"). Yet when you pool the head-to-head studies, the average LLM-vs-human persuasion gap is statistically indistinguishable from zero, with Hedges' g = 0.02 across 7 studies (see "Are language models actually more persuasive than humans?"), and the apparent advantage is conditional, varying by model and by direction of persuasion (see "Do large language models persuade better than humans?"). That's the same lesson as the ideology-control result, one level up: persuasion looks like a stable trait of the message until you control for context, and then most of it dissolves into the situation.
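The pooled comparison is reported in the sources as a Hedges' g, so a short sketch of how that statistic is computed and pooled may help. This uses the standard small-sample correction and a fixed-effect inverse-variance average. The pooling model is an assumption (the source does not say which model the meta-analysis used), and the two "studies" below are invented numbers for illustration, not data from the meta-analysis.

```python
import math

def hedges_g(mean_a, mean_b, sd_a, sd_b, n_a, n_b):
    """Standardized mean difference with the small-sample correction."""
    pooled_var = ((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / (n_a + n_b - 2)
    d = (mean_a - mean_b) / math.sqrt(pooled_var)      # Cohen's d
    correction = 1 - 3 / (4 * (n_a + n_b) - 9)         # approximate J factor
    return correction * d

def g_variance(g, n_a, n_b):
    """Common large-sample approximation to the sampling variance of g."""
    return (n_a + n_b) / (n_a * n_b) + g**2 / (2 * (n_a + n_b))

def pooled_g(studies):
    """Fixed-effect pooled estimate: inverse-variance weighted mean of g."""
    weights, effects = [], []
    for mean_a, mean_b, sd_a, sd_b, n_a, n_b in studies:
        g = hedges_g(mean_a, mean_b, sd_a, sd_b, n_a, n_b)
        weights.append(1 / g_variance(g, n_a, n_b))
        effects.append(g)
    return sum(w * g for w, g in zip(weights, effects)) / sum(weights)

# Hypothetical studies: (LLM mean, human mean, LLM sd, human sd, n LLM, n human)
studies = [
    (5.2, 5.0, 1.0, 1.0, 200, 200),  # LLM slightly ahead
    (4.8, 5.0, 1.0, 1.0, 200, 200),  # human slightly ahead
]
print(f"pooled g: {pooled_g(studies):.3f}")
```

The two invented studies point in opposite directions with equal weight, so they cancel in the pooled estimate. That is one way a near-zero average can coexist with real, conditional differences in individual comparisons.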
The thing worth walking away with: the most robust "linguistic features of persuasion" may not be vocabulary or topic words at all, but *delivery mechanics* — packaging information as presupposed background, and projecting unearned conviction — precisely because those work the same way regardless of who's listening. Everything that depends on what the audience already believes was never a feature of the language to begin with.
Sources (7 notes)
Analysis of debate corpora shows that political and religious ideology labels of voters outpredict linguistic features when modeling debate outcomes. Language effects observed without reader controls are confounded by audience composition correlated with debate topics.
The linguistic features that predict persuasion success change dramatically once political and religious ideology are added as statistical controls. Features appearing predictive in standard analyses often reflect audience-text matching rather than true language effects, making many published findings potentially artifacts of audience composition.
Experimental evidence shows presuppositions with additive, iterative, and factive triggers persuade audiences more than assertions, especially for discourse-new content. The mechanism: presuppositions bypass evaluative scrutiny by presenting claims as already-accepted background.
Linguistic analysis shows LLMs express higher conviction than human persuaders, and this confidence-loading directly correlates with persuasive outcomes regardless of whether claims are true or false. RLHF training installs an assertive register that functions as a content-independent persuasion amplifier.
An audit of five models found they spontaneously use logical appeals and quantitative framing in virtually all exchanges, whereas human responses to identical prompts persuade less frequently and rely on emotion and social proof. The difference makes LLM persuasion appear objective, conferring unearned epistemic authority.
A meta-analysis of 7 studies with 17,422 participants found no detectable difference in persuasive effectiveness between LLMs and humans (Hedges' g = 0.02). Persuasiveness appears conditional on context rather than speaker category.
Claude beats incentivized humans at both truthful and deceptive persuasion, while DeepSeek only beats them when arguing for falsehoods. The persuasion mechanism appears content-independent, suggesting model family itself acts as a contextual moderator.