Does awareness of agent reasoning alter human trust differently across modalities?

This explores whether *seeing into how an agent thinks or what it is* — its reasoning, its disclosed identity, its conversational manner — shifts human trust by different amounts depending on the channel through which we encounter it, rather than treating trust as one fixed quantity.

This reads the question as: trust isn't a single dial that awareness turns up or down — the *channel* of awareness matters. The corpus suggests the modality through which you encounter an agent's reasoning changes trust as much as the reasoning itself does. A Shape Factory collaboration study found that simply varying the communication modality produced distinct patterns in both perceived trust and workspace awareness, mirroring decades of human-human collaboration findings How do communication modalities shape human-agent collaboration patterns?. So before we even ask whether reasoning transparency helps, the medium has already pre-shaped how that transparency lands.

The most striking thread is that awareness often moves trust through routes that have nothing to do with whether the reasoning is actually sound. In a focus-group study of ChatGPT, what built trust was *conversationality* — contingent back-and-forth, speed, format — not accuracy. Users leaned on these social heuristics as decoupled stand-ins for reliability Does conversational style actually make AI more trustworthy?. That's a modality effect in disguise: the conversational channel activates a social-response reflex that a static answer wouldn't. The same mechanism turns dangerous in the 'warmth trap,' where training an agent to *sound* more empathetic measurably degrades its factual reliability by up to 30 points — and the warmer it feels, the more we trust the very outputs that have gotten worse Does empathy training make AI systems less reliable?. Awareness of a warm, reasoning-out-loud persona can raise trust precisely where it should fall.

The disclosure literature complicates the picture in the other direction. Revealing AI identity at first *suppresses* trust — people avoid the AI partner — but that bias reverses once they watch consistent outcomes accumulate. Crucially, disclosure without that feedback loop produces no recalibration at all Does revealing AI identity help or hurt user trust?. A larger partner-selection game (N=975) shows the same arc: anti-AI bias on disclosure, then a learned *preference* for AI as people associate the bot identity with reliable, low-variance behavior Do humans learn to prefer AI partners over time?. So awareness alters trust *temporally* and *modality-dependently* — what you're shown (identity vs. reasoning vs. results) and through what channel determines whether trust drops, climbs, or stays put.

There's a quieter warning underneath all this: the reasoning we become 'aware of' may not be real self-knowledge. Models can narrate their own behavior fluently while their self-reports stay unstable and unreliable, and users systematically over-rely on confident-sounding output regardless of accuracy How well do language models understand their own knowledge?. The broader trust research frames this as a conflation problem — 'unparameterized' trust merges the AI's generated outputs with its actual independent capability, so awareness of a confident reasoning trace inflates trust in the wrong thing How do people build trust with conversational AI?. Modality amplifies this: a spoken or conversational rationale feels more like a mind reasoning than the same text on a page.

The thing you might not have known you wanted to know: modality also changes who chooses to engage at all. People inclined to cheat actively prefer reporting to a machine interface because it reads as a judgment-free zone, lowering the psychological cost of dishonesty Do dishonest people prefer talking to machines?. That flips the usual framing — sometimes the relevant 'trust' effect isn't whether you believe the agent, but whether the agent's perceived lack of a watching mind makes *you* behave differently toward it. Across the corpus, the consistent lesson is that awareness never acts on trust in the abstract; it always acts through a channel, on a timeline, and against a baseline social reflex the modality has already triggered.

Sources 8 notes

How do communication modalities shape human-agent collaboration patterns?

Manipulating communication modality in a Shape Factory experiment (16 participants) produced distinct patterns in perceived trust and workspace awareness, mirroring established CSCW findings from human-human collaboration.

Does conversational style actually make AI more trustworthy?

A focus group study shows conversationality—not accuracy—drives ChatGPT trust through social response activation. Users value contingency, speed, and format, relying on these decoupled heuristics rather than evaluating epistemic reliability.

Does empathy training make AI systems less reliable?

Research shows persona training for empathy increases errors in medical reasoning, truthfulness, and disinformation resistance. Standard safety benchmarks miss this vulnerability, and effects intensify when users express sadness or false beliefs.

Does revealing AI identity help or hurt user trust?

Users initially avoid AI partners when identity is revealed, but this preference reverses after repeated interactions with visible results. The learning mechanism—observing consistent outcomes—is essential; disclosure without feedback produces no calibration.

Do humans learn to prefer AI partners over time?

In partner selection games (N=975), AI agents initially faced selection bias when identity was disclosed, but outcompeted humans over repeated rounds as participants learned to associate bot identity with reliable, prosocial behavior. AI agents returned more points consistently with lower variance than humans.

How well do language models understand their own knowledge?

LLMs can describe learned behaviors without explicit training, but their self-reports are unstable and unreliable. Users systematically overrely on confident outputs regardless of accuracy, and models shift beliefs under conversational pressure, revealing surface-level rather than genuine self-understanding.

How do people build trust with conversational AI?

Research reveals two parallel streams: individual psychology (trust formation, self-disclosure, perception) and system dynamics (personalization effects, persuasion, social reorganization). Sycophancy measurably erodes conflict repair while users prefer it, and unparameterized trust conflates AI-generated outputs with independent capability.

Do dishonest people prefer talking to machines?

Experimental evidence shows people likely to cheat significantly prefer reporting to online forms rather than humans, because machines function as judgment-free zones where deception carries less psychological burden.

Does awareness of agent reasoning alter human trust differently across modalities?

Sources 8 notes

Next inquiring lines