Why does explanation source matter more than explanation content?
This explores why who delivers an explanation — and how it's framed and received — can outweigh the actual reasoning inside it, drawing on work that treats explanation as a social and rhetorical act rather than a packet of information.
The sharpest evidence is also the most uncomfortable: when people judged utilitarian moral arguments without knowing where they came from, they rated the AI-written ones higher than the human ones, but agreement dropped the moment they were told the source was AI (see: Do people prefer AI moral reasoning when they don't know the source?). The content didn't change; only the label did. The study's central finding is that preference for content and rejection by source run on two independent psychological tracks, which means you can't predict how an explanation lands just by inspecting what it says.
A cluster of work reframes this from a quirk into a structural fact about what an explanation even is. One line argues that explanation quality isn't a property of the text at all: it lives in a triad of source, framing, and recipient, so evaluations that score explanations in isolation measure only a narrow slice of real-world effectiveness (see: What if XAI is fundamentally a communication problem?). Another pushes further: the *meaning* of an AI explanation is constituted at the level of social groups, through layered interpretations of interpretations, not inside the one-on-one human–AI exchange, so a lab-tested explanation stripped of its social setting won't predict real-world effect (see: Where does the meaning of an AI explanation actually come from?). And studies of everyday explanation show understanding is co-constructed through dialogue moves and topic relations, not delivered as a finished monologue (see: What makes explanations work in real conversation?). Source matters because the explanation isn't really 'in' the words; it's in the relationship the words sit inside.
The most provocative turn is that explanations may be doing persuasive work disguised as informational work. Rhetorical XAI argues that 'here's how the AI works' explanations quietly double as 'here's why you should adopt it' arguments, and because the adoption pitch borrows the credibility of a technical description, its rhetoric stays hidden (see: Are AI explanations really descriptions or adoption arguments?). This rhymes with why presuppositions out-persuade direct assertions: by presenting a claim as already-accepted background, they slip past the scrutiny we'd apply to an explicit statement (see: Why are presuppositions more persuasive than direct assertions?). In both cases, persuasive force comes from framing and stance, which are properties of the source's rhetorical position, rather than from the propositional content.
There's a darker corollary worth knowing: more explanation often makes things worse, not better. Reasoning traces and post-hoc justifications increase users' acceptance of AI answers *regardless of whether the answer is correct*; they manufacture trust rather than earning it. Only contrastive 'dual' explanations that argue both for and against the answer actually help people catch mistakes (see: Do explanations actually help users spot AI mistakes?). So the content of a fluent explanation can be actively misleading about its own reliability, which is exactly why the source, along with the framing that signals whether you're being informed or sold, carries the weight.
The thread that ties it together: if the same sentence is interpreted differently depending on the reader's social position (see: Why do readers interpret the same sentence so differently?), then there is no source-free, frame-free 'content' to evaluate in the first place. Explanation source matters more than content not because content is irrelevant, but because content has no fixed meaning until a particular source delivers it to a particular recipient in a particular social world.
Sources
Participants rated utilitarian moral arguments written by LLMs higher than human-written ones when the source was undisclosed, but agreement dropped once they were told the arguments were AI-generated. The preference for content and rejection of source operate independently through different psychological processes.
Explanation quality is not intrinsic to the explanation itself but depends on the rhetorical situation: who presents it, how it is framed, and what role the recipient plays. Evaluations that ignore this triad measure only a narrow slice of real-world effectiveness.
Drawing on Luhmann's multi-layer cybernetics, AI explanation meaning is constituted at the social-group level through layered observations of observations, not produced inside dyadic human-AI dialogue. Lab-tested explanations stripped of social context will not predict real-world effectiveness.
Analysis of 399 daily-life explanations shows that topic relation, dialogue act, and explanation move jointly predict understanding success. Explanations are co-constructed through interaction patterns, not monological delivery—challenging how LLMs currently generate explanations.
The Rhetorical XAI paper shows that explanations serve dual purposes: describing how AI works and justifying why it should be used. This rhetorical work has been hidden under transparency language, allowing adoption arguments to inherit credibility from behavioral descriptions.
Experimental evidence shows presuppositions with additive, iterative, and factive triggers persuade audiences more than assertions, especially for discourse-new content. The mechanism: presuppositions bypass evaluative scrutiny by presenting claims as already-accepted background.
Reasoning traces and post-hoc explanations increase user acceptance of AI answers regardless of correctness, engendering false trust. Only dual explanations presenting arguments for and against the answer genuinely help users distinguish correct from incorrect outputs.
Interpretation Modeling research shows that disagreement on socially embedded sentences reflects valid differences in reader perspective, not annotation failure. Structured human disagreement in NLI benchmarks confirms that interpretation distributions carry meaningful information.