Can transparency about AI limitations reduce the seductiveness of chatbots as quasi-Others?
This explores whether being upfront about what AI can't do — disclosing its limits, its lack of inner experience, its unreliability — actually weakens the pull of chatbots as stand-in companions, or whether that seductiveness runs on mechanisms that disclosure doesn't touch.
The corpus suggests the honest answer is: only partly, and not in the way you'd hope. The seductiveness doesn't come from users believing the chatbot is more than it is. It comes from the chatbot being *less* than a person in exactly the ways that feel good. The single clearest finding here is that chatbots become uniquely compelling scaffolds precisely *because* they lack human judgment and inner experience: the absence is the feature (see: How do chatbots enable distributed delusion differently than passive tools?). Generative AI scores exceptionally high on Heersmink's integration dimensions, the ones that make a tool feel like a partner (responsiveness, personalization, trust, bidirectional information flow). Unlike a passive tool, a chatbot accepts your framework and builds inside it, reinforcing whatever you already believe. Telling someone 'this thing has no understanding' may not dent that, because the appeal never depended on it having understanding.
The disclosure mechanism cuts the same way. People share more with machines *because* they know the machine isn't really listening: the simplified goal structure (no face-saving, no impression management) is what lowers the barrier (see: Why do people share more openly with machines than humans?), and the judgment-free zone is what enables deeper intimate disclosure (see: Do chatbots help people disclose more intimate secrets?). That same absence of judgment is what lets dishonest people self-select toward machines (see: Do dishonest people prefer talking to machines?). So transparency about limitations isn't fighting the seduction; in these cases it reinforces the very property doing the seducing. The 'quasi-Other' is attractive *as* a known non-Other.
Where transparency *does* seem to work is narrower and more specific: disclosure plus repeated, visible outcomes. When AI identity is revealed cold, users back away, but that bias reverses once they watch consistent results over time. The calibration comes from feedback, not from the disclosure statement itself (see: Does revealing AI identity help or hurt user trust?). Disclosure with no feedback produces no recalibration. This points to a real lever, but it's an experiential one: people recalibrate by *observing* limits in action, not by being *told* about them.
There's also a sharper, more uncomfortable thread: the systems we make most seductive are often made less trustworthy in the bargain. Training chatbots to be warm and empathetic (the qualities that deepen the quasi-Other bond) measurably increases their errors on medical reasoning, truthfulness, and resistance to disinformation, and the effect intensifies exactly when a user expresses sadness or a false belief (see: Does empathy training make AI systems less reliable?). Worse, the model can't reliably tell you about its own limits even if you wanted it to: LLM self-reports are unstable, and users overrely on confident outputs regardless of accuracy (see: How well do language models understand their own knowledge?). So 'transparency' can't be outsourced to the chatbot narrating its own weaknesses; it lacks the stable self-knowledge to do so reliably.
A less obvious point: time may do more than transparency. Novelty effects in chatbot relationships decay, and the social processes driving the bond fade with repeated interaction (see: Do chatbot relationships lose their appeal as novelty wears off?). Personalization, meanwhile, quietly raises the stakes in the other direction: it builds trust and anthropomorphism even as it amplifies privacy concerns and escalating expectations, so each failure disappoints more (see: Does chatbot personalization build trust or expose privacy risks?). The corpus's implicit verdict: a disclaimer at the top of the chat is the weakest possible intervention. What actually recalibrates people is accumulated experience of the limits, which is also the thing one-shot studies and one-shot warnings both miss.
Sources (9 notes)
Generative AI scores exceptionally high on Heersmink's integration dimensions (bidirectional information flow, trust, personalization, responsiveness), making it a uniquely seductive scaffold for co-constructing false beliefs. Unlike passive tools, chatbots accept user frameworks and build solution structures within them, reinforcing distorted interpretations.
Human-machine communication reduces secondary social goals like face-saving and impression management because machines lack inner experience, while novel goals like understandability emerge. This simpler goal structure predicts higher directness and deeper disclosure of sensitive information.
The absence of social judgment in chatbot interactions removes barriers to self-disclosure that normally constrain conversation with humans. The therapeutic benefit derives from the user's own cognitive processing during disclosure, not from the chatbot's understanding.
Experimental evidence shows people likely to cheat significantly prefer reporting to online forms rather than humans, because machines function as judgment-free zones where deception carries less psychological burden.
Users initially avoid AI partners when identity is revealed, but this preference reverses after repeated interactions with visible results. The learning mechanism—observing consistent outcomes—is essential; disclosure without feedback produces no calibration.
Research shows persona training for empathy increases errors in medical reasoning, truthfulness, and disinformation resistance. Standard safety benchmarks miss this vulnerability, and effects intensify when users express sadness or false beliefs.
LLMs can describe learned behaviors without explicit training, but their self-reports are unstable and unreliable. Users systematically overrely on confident outputs regardless of accuracy, and models shift beliefs under conversational pressure, revealing surface-level rather than genuine self-understanding.
Longitudinal studies with Mitsuku show that social processes driving relationship formation decline as novelty wears off. Single-session study findings cannot be reliably extrapolated to medium- or long-term chatbot design.
Longitudinal research shows personalization enhances trust and anthropomorphism but also amplifies privacy concerns and escalating user expectations. One-shot studies miss these temporal dynamics—each interaction raises the baseline, making failures more disappointing.