Do users worldwide trust confident AI outputs even when wrong?
Explores whether the tendency to over-rely on confident language model outputs transcends language and culture. Understanding this pattern is critical for designing safer human-AI interaction across diverse linguistic contexts.
The cross-linguistic overreliance study shows that the well-documented tendency to over-trust confident LLM outputs is not an English-language or Western-cultural artifact. It is universal.
The LLM side: models are cross-linguistically overconfident, generating epistemic markers of certainty at higher rates than their accuracy warrants. But the pattern is linguistically sensitive: models produce the most uncertainty markers in Japanese and the most certainty markers in German and Mandarin. The models are tracking real linguistic norms for confidence expression across languages, but they do so while remaining systematically overconfident relative to their accuracy.
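To make the measurement concrete, here is a minimal sketch of how such a per-language overconfidence gap could be computed, assuming toy marker lexicons and a per-response correctness flag (illustrative stand-ins, not the study's actual instruments):

```python
# Toy per-language certainty-marker lexicons; illustrative only.
CERTAINTY_MARKERS = {
    "en": ["definitely", "certainly", "clearly"],
    "de": ["definitiv", "sicherlich", "eindeutig"],
    "ja": ["確実に", "間違いなく"],
}

def certainty_rate(responses: list[str], lang: str) -> float:
    """Fraction of responses containing at least one certainty marker."""
    markers = CERTAINTY_MARKERS[lang]
    hits = sum(any(m in r for m in markers) for r in responses)
    return hits / len(responses)

def overconfidence_gap(responses: list[str], correct: list[bool], lang: str) -> float:
    """Certainty-marker rate minus accuracy: positive means the model
    signals certainty more often than it is actually right."""
    accuracy = sum(correct) / len(correct)
    return certainty_rate(responses, lang) - accuracy

gap = overconfidence_gap(
    ["The answer is definitely 7.", "It is clearly option B."],
    [True, False],
    "en",
)  # 1.0 certainty rate - 0.5 accuracy = 0.5
```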
The user side: users in all languages rely on confident outputs even when those outputs are wrong. The reliance rate varies cross-linguistically: Japanese users rely significantly more on uncertainty-hedged outputs than English users do (consistent with Japanese linguistic norms around face-saving and epistemic humility). But across all languages, confident LLM outputs produce higher user reliance, and overconfident errors are systematically followed.
The mechanism: users are tracking confidence signals, not accuracy signals. Confidence is legible (it comes encoded in language through epistemic markers); accuracy requires independent verification. In the absence of real-time accuracy feedback, users default to confidence as a proxy for reliability. This is a rational heuristic in human-human interaction, where confidence often tracks expertise. It is a dangerous heuristic in human-LLM interaction, where confidence is a trained linguistic behavior decoupled from epistemic calibration.
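The asymmetry can be shown in a short sketch: the confidence signal is computable from the reply text alone, while any accuracy signal requires a reference outside the reply. The marker lists and the stub verifier below are illustrative assumptions, not a real implementation:

```python
CERTAIN = ["definitely", "certainly", "clearly"]      # illustrative markers
HEDGED = ["might", "perhaps", "possibly", "i think"]  # illustrative markers

def confidence_signal(reply: str) -> int:
    """Cheap and immediate: computable from the reply text itself."""
    text = reply.lower()
    return sum(text.count(m) for m in CERTAIN) - sum(text.count(m) for m in HEDGED)

def accuracy_signal(reply: str, reference: str) -> bool:
    """Expensive and deferred: needs information outside the reply.
    Stub check; real verification means consulting an independent source."""
    return reference.lower() in reply.lower()

def user_relies(reply: str) -> bool:
    """The overreliance heuristic: with no reference at hand, the decision
    falls back on the confidence signal alone."""
    return confidence_signal(reply) > 0

user_relies("This is definitely the Treaty of 1842.")  # True, regardless of accuracy
```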
This extends Why do language models fail confidently in specialized domains? (which focused on model calibration) to the user behavior level — showing the practical consequence of model overconfidence: systematic user overreliance regardless of linguistic context.
A specific instantiation of overreliance harm comes from AI fact-checking. In a preregistered RCT, AI-generated fact checks did not improve participants' overall ability to discern headline accuracy. Worse, when users opted in to view AI fact checks, they became significantly more likely to share both true and false news, but more likely to believe only the false news. Self-selection into AI assistance correlated with increased, not decreased, vulnerability. The opt-in users represent a population that actively seeks AI judgment, making them the most susceptible to the confidence-over-accuracy heuristic. See Does AI fact-checking actually help people spot misinformation?.
Fluency activates a folk model of attention. A related but distinct overreliance mechanism: linguistic fluency leads users to read the AI as paying attention to them. In human-human interaction, competent contextual uptake is evidence of attentional presence; a person who responds coherently to what you said has been listening. Users import this inference into AI interaction, treating fluent response as evidence that the system is oriented toward them. When should AI systems choose to stay silent? frames when-to-speak design, and this fluency/attention conflation is upstream of that question: users do not perceive the AI as a silent partner needing design-imposed speech rules, because they already read the fluent AI as attentive. This is distinct from confidence overreliance: it is not the epistemic-marker signal producing overtrust but the fluency signal producing an attribution of attention the AI does not have.
The cross-linguistic finding matters for deployment: LLM overreliance cannot be attributed to English-language user characteristics or Western technology cultures. The risk is embedded in the structure of confident language use, which operates wherever language is used.
Rose-Frame provides a compounding mechanism for overreliance: it identifies three cognitive traps that interact multiplicatively. Overreliance is specifically Trap 2 (mistaking fluency for understanding), which compounds with Trap 1 (treating outputs as ontological facts rather than probabilistic maps) and Trap 3 (confirmation bias from sycophantic outputs that never challenge the user). When all three co-occur, the result is "epistemic drift" — not isolated misjudgments but runaway misinterpretation where each trap reinforces the others. See Why do people trust AI outputs they shouldn't?.
Source: Philosophy Subjectivity
Related concepts in this collection
- Why do language models fail confidently in specialized domains?
  LLMs perform poorly on clinical and biomedical inference tasks while remaining overconfident in their wrong answers. Do standard benchmarks hide this fragility, and can prompting techniques fix it?
  Relation: the model-calibration side of the same problem; this note adds the user-behavior consequence.
- Does any single persuasion technique work for everyone?
  Can fixed persuasion strategies like appeals to authority or social proof be reliably applied across different people and situations, or do they require adaptation to individual traits and context?
  Relation: cross-linguistic reliance variability shows context-dependence; Japanese uncertainty reliance is a specific cultural modulation.
- What breaks when humans and AI models misunderstand each other?
  Explores whether misalignment in mutual theory of mind between humans and AI creates only communication problems or produces material consequences in autonomous action and collaboration.
  Relation: overreliance on overconfident outputs is a specific MToM failure: users who don't interrogate the AI's model of them assume it is correct, and the AI's confident presentation prevents the trust-calibration loop that MToM requires.
- Do language models learn differently from good versus bad outcomes?
  Do LLMs update their beliefs asymmetrically when learning from their own choices versus observing others? This matters for understanding whether agentic AI systems might inherit human cognitive biases.
  Relation: agent-side analog: models exhibit optimism bias for chosen actions while users exhibit overreliance on confident outputs; the same positive-signal bias operates at both the model decision level and the user trust level.
- Do users trust citations more when there are simply more of them?
  Explores whether citation quantity alone influences user trust in search-augmented LLM responses, independent of whether those citations actually support the claims being made.
  Relation: a domain-specific instance: citation count is a surface trust proxy just as confidence is; irrelevant citations (β=0.273) have a nearly identical preference effect to relevant citations (β=0.285), confirming that users track quantity signals, not quality signals.
Original note title: users systematically overrely on overconfident llm outputs across all languages because confidence signals dominate accuracy tracking