Language Understanding and Pragmatics

Does AI fact-checking actually help people spot misinformation?

An RCT tested whether AI fact-checks improve people's ability to judge headline accuracy. The results reveal asymmetric harms: AI errors push users in the wrong direction more than correct labels help them.

Note · 2026-02-23 · sourced from Sentiment Semantics Toxic Detections

A preregistered RCT tested AI fact-checks (from a popular AI model) on political news headlines. The overall finding: AI fact-checking does not significantly affect participants' ability to discern headline accuracy or share accurate news. But the errors are asymmetric and harmful.

The asymmetry: when the AI mislabels true headlines as false, participants decrease their belief in those true headlines. When the AI expresses uncertainty about false headlines, participants increase their belief in those false headlines. The AI's mistakes are not neutral — they actively push users in the wrong direction on both ends.

The opt-in finding is equally concerning. When participants are given the choice to view AI fact-checks and choose to do so, they become significantly more likely to share both true and false news — but only more likely to believe false news. Self-selection into AI assistance does not indicate sophistication; it correlates with increased vulnerability to misinformation.

This connects to the overreliance literature through a specific mechanism: users are not simply trusting AI outputs — they are using AI outputs as replacement signals for their own judgment. When the AI says "false," the user's prior belief in a true headline is overridden. The user delegates the epistemic work rather than using the AI as one input among many.

The practical implication is severe for AI deployment in information integrity contexts. An AI fact-checker that is "reasonably" accurate but imperfect creates a false safety net. Users who rely on it perform worse than users who rely on their own judgment, because the mislabeling errors have outsized influence. The asymmetry means AI fact-checking is net harmful unless accuracy exceeds a threshold where mislabeling damage is offset by correct labeling benefit — and the paper suggests current AI is below that threshold.
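The threshold argument above can be made concrete with a toy expected-value model. This is not from the paper — the effect sizes and the linear form are illustrative assumptions — but it shows why asymmetric harms (mislabel damage larger than correct-label benefit) push the break-even accuracy well above 50%.

```python
# Toy model (illustrative, not from the paper): a fact-checker with a given
# accuracy shifts belief by +benefit per correct label and -harm per mislabel.
# All parameter values below are hypothetical assumptions.

def net_effect(accuracy: float, benefit: float, harm: float) -> float:
    """Expected per-headline belief shift toward the truth."""
    return accuracy * benefit - (1 - accuracy) * harm

def breakeven_accuracy(benefit: float, harm: float) -> float:
    """Accuracy at which correct-label benefit exactly offsets mislabel harm."""
    return harm / (benefit + harm)

# If mislabels hurt three times as much as correct labels help (h = 3b),
# the checker must be right at least 75% of the time just to break even.
b, h = 1.0, 3.0  # hypothetical effect sizes
print(breakeven_accuracy(b, h))   # -> 0.75
print(net_effect(0.90, b, h))     # positive: a 90%-accurate checker helps
print(net_effect(0.70, b, h))     # negative: a 70%-accurate checker hurts
```

Under these assumptions, "reasonably accurate" is not good enough: any accuracy below the break-even point makes the checker net harmful, which matches the paper's qualitative claim.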




AI fact-checking creates asymmetric harm through mislabeling — users decrease belief in true headlines labeled false and increase belief in false headlines labeled uncertain