Does telling people an AI wrote something actually stop them from believing it?
When audiences learn that AI created content, do they become skeptical enough to resist its persuasive pull? This note explores whether disclosure works as a genuine defense against AI-driven persuasion or merely shifts how people process it.
The Thin Line ablation manipulated audience awareness of AI involvement across three groups (A, B, C) and measured both critical engagement and sway. Group A — unaware of AI — perceived the LLM as more competent. Groups B and C — aware or suspecting — were more critical of the arguments. But sway proportions across the three groups ranged from 34% to 62%, and the LLM still moved substantial fractions of audiences who knew an AI was involved. Disclosure raised scrutiny without collapsing the effect.
This challenges the assumption baked into much policy thinking — that AI labeling functions like advertising disclosure, and that once disclosed, the persuasive force decays sharply. The Thin Line evidence suggests disclosure modulates the channel through which AI influence operates rather than blocking the channel altogether. Aware audiences shift toward central-route processing (more scrutiny, more counter-arguing) but counter-arguing does not zero out the persuasive content; it leaves a residual sway proportion that is still meaningful at scale.
This sharpens the observation that we lack a cultural position on AI-generated discourse. We have a fully developed cultural reflex for advertising: the disclosed-paid-content posture is decades old and supported by school curricula, regulation, and consumer literacy. We do not yet have an analogous reflex for AI-generated discourse. The gap is not just rhetorical; it is measurable in the residual sway proportions that remain when AI authorship is known.
It also connects to "Do people prefer AI moral reasoning when they don't know the source?" The anti-AI bias is real but bounded: it raises the threshold for acceptance without making AI arguments inert. Combined, the picture is that people prefer AI moral content when blind to the source, become biased against it once the source is revealed, and are still moved by it even then. Three findings, one design implication: disclosure is a necessary but not sufficient safety mechanism.
For writing about AI authorship and false-punditry, the operational point: a "this was written with AI" label is not a neutralizer. It is a critical-route activator with a partial residual. Designs that lean on disclosure as the primary defense should be paired with content-side interventions, not treated as complete on their own.
Source: Argumentation Paper: The Thin Line Between Comprehension and Persuasion in LLMs
Related concepts in this collection
- How do we learn to read AI-generated text critically?
  Publics have developed interpretive postures toward journalism, advertising, and scholarship over time. But AI discourse arrived too suddenly for any cultural discount to form, raising questions about how we might develop one.
  (measurable footprint of the missing cultural reflex)
- Do people prefer AI moral reasoning when they don't know the source?
  Explores whether humans genuinely prefer AI-generated moral justifications or whether source knowledge changes their evaluation. This matters for understanding whether AI reasoning quality is underestimated in real-world deployment.
  (anti-AI bias is bounded, not categorical)
Original note title: audience awareness of AI involvement raises critical scrutiny but does not collapse persuasive effect — the AI-disclosure shield is partial