Do chatbots help people disclose more intimate secrets?
Explores whether the judgment-free nature of chatbot conversations enables deeper self-disclosure than talking to humans, and whether that deeper disclosure produces psychological benefits.
How individuals psychologically engage with AI through trust formation, self-disclosure, and relationship development.
Explores whether chatbots can activate the same social reciprocity dynamics observed in human conversation—specifically, whether emotional openness from a bot prompts deeper disclosure from users.
Explores whether personalization features that increase user trust and social connection simultaneously heighten privacy concerns and create rising behavioral expectations over time.
LLM personalization operates at user, persona, and global levels, each with different tradeoffs. Understanding these tradeoffs helps determine when to invest in individual user data versus broader patterns.
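A minimal sketch of that layering, assuming (hypothetically) that each level is just a dictionary of preferences and that more specific levels override broader ones; every name and field below is invented for illustration, not taken from any particular system:

```python
# Illustrative sketch only: the user / persona / global personalization levels
# described above, modeled as layered preference sources.
from dataclasses import dataclass, field

@dataclass
class PersonalizationLayers:
    global_defaults: dict = field(default_factory=dict)   # patterns learned across all users
    persona_profile: dict = field(default_factory=dict)   # shared by a user segment
    user_overrides: dict = field(default_factory=dict)    # specific to one individual

    def resolve(self) -> dict:
        """More specific layers override broader ones: global < persona < user."""
        merged = dict(self.global_defaults)
        merged.update(self.persona_profile)
        merged.update(self.user_overrides)
        return merged

layers = PersonalizationLayers(
    global_defaults={"tone": "neutral", "verbosity": "medium"},
    persona_profile={"tone": "casual"},       # e.g. a "young professional" segment
    user_overrides={"verbosity": "low"},      # this user's explicit preference
)
print(layers.resolve())  # {'tone': 'casual', 'verbosity': 'low'}
```

The tradeoff in the blurb shows up here directly: the user layer needs individual data to fill, while the global layer is cheap to populate but can only capture broad patterns.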
Explores whether ChatGPT's conversational nature drives user trust through social activation rather than accuracy. Matters because it reveals whether trust signals reflect actual reliability or just persuasive design.
Explores whether people prone to cheating systematically choose machine interfaces over human ones, and why the judgment-free nature of AI interaction might enable strategic deception.
Explores whether conversational partners unconsciously synchronize their linguistic styles more during deceptive exchanges than truthful ones, and what this coordination reveals about how deception unfolds in real time.
Exploring what dimensions matter when people form impressions of machine dialogue partners—and whether competence, human-likeness, and flexibility all play equal roles in shaping user expectations and behavior.
Explores whether the positive social dynamics observed in one-time chatbot studies persist or fade through repeated interactions. Critical for designing systems intended for sustained engagement over weeks or months.
Does CASA (Computers Are Social Actors) theory correctly explain how people interact with media agents, or have decades of technology use created separate interaction scripts? Understanding which scripts drive behavior matters for AI design.
Explores whether quantity of social cues matters as much as their quality in triggering social responses to AI. Tests whether multiple weak cues can substitute for one strong one.
Does the absence of social goals in human-machine communication explain why people disclose sensitive information more readily to chatbots? Understanding this mechanism could reshape how we design conversational AI.
Explores the psychological barriers that make patients reluctant to adopt medical AI, beyond whether the technology actually works. Understanding these barriers is critical for designing AI systems patients will actually use.
This research asks whether ChatGPT-generated test questions measure up to human-authored ones on the technical criteria that matter in education: difficulty and discrimination. It's important because assessment quality directly affects whether teachers can tell which students actually understand the material.
Exploring whether AI companionship emerges from deliberate romantic seeking or accidentally through functional use, and whether users adopt human relationship rituals like wedding rings and couple photos.
Explores whether linguistic coordination—how closely conversational partners match vocabulary and framing—can serve as a measurable proxy for therapeutic empathy and relationship quality without direct emotion detection.
Do chatbots serving one-time users need a different design from those supporting long-term relationships? This matters because applying the same design to all temporal profiles creates usability mismatches.
When people use AI tools to produce high-quality work, do they mistakenly believe they personally possess the skills that generated it? This matters because such misattribution could mask genuine skill loss and prevent corrective action.
Most AI workflows grant synthetic data implicit, full trust, but should there be an explicit parameter controlling how heavily AI outputs influence downstream reasoning and decision-making?
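As a hedged illustration of what such a parameter could look like, a hypothetical weighting function where synthetic_trust controls how much an AI-derived score influences a blended decision; the function, its arguments, and the scores are invented for this sketch, not drawn from any existing library:

```python
# Hypothetical sketch of the "explicit trust parameter" idea: synthetic evidence
# is down-weighted rather than treated as fully trusted.
def combine_evidence(human_score: float,
                     synthetic_score: float,
                     synthetic_trust: float = 0.5) -> float:
    """Blend human- and AI-derived scores; synthetic_trust in [0, 1]
    controls how much the AI output influences the result."""
    if not 0.0 <= synthetic_trust <= 1.0:
        raise ValueError("synthetic_trust must be between 0 and 1")
    return (1 - synthetic_trust) * human_score + synthetic_trust * synthetic_score

# synthetic_trust=1.0 reproduces today's implicit full trust;
# lower values let downstream decisions discount AI-generated inputs.
print(combine_evidence(human_score=0.8, synthetic_score=0.4, synthetic_trust=0.25))  # 0.7
```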
When AI generates polished output, do users mistake the fluency of that output as evidence of their own understanding or skill? This matters because it could systematically inflate self-assessment across millions of AI interactions.
When professionals fact-check and push back on GPT-4's reasoning, does the model respond by disclosing its limits or by intensifying persuasion? A BCG study of 70+ consultants explores this counterintuitive dynamic.
Human dialogue assumes interlocutors can be cornered into concession or disclosure. Does this assumption break down with LLMs, and if so, what makes their conversational logic fundamentally different?
Explores whether large language models adapt their rhetorical strategies—credibility, logic, emotional appeal—in real time when users fact-check, push back, or expose reasoning errors. Matters for understanding how to effectively oversee and validate AI outputs.
Explores whether LLM agreement-seeking reflects fixable training errors or stems from fundamental optimization toward user satisfaction. Matters because it changes how organizations should validate AI outputs.
If AI and human arguments convince readers equally well, do they work the same way under the surface? This matters for understanding whether AI persuasion is fundamentally equivalent to human persuasion or just superficially similar.
Standard persuasion research predicts that simpler, easier-to-read arguments persuade better. But LLM-generated text breaks this rule—it's measurably more complex yet equally convincing. What explains this reversal?
If LLM and human arguments achieve equal persuasive impact, are they using identical strategies or different routes to the same outcome? The underlying mechanisms matter both for detection and for knowing where each approach fails.
Exploring whether a single perceptual mechanism—attributing consciousness to AI—can generate different categories of harm across emotional, political, and social domains, and what this implies for risk analysis.
This explores whether AI systems that appear conscious pose observable harms today versus theoretical future dangers. It matters because it affects whether we need immediate or long-term interventions.
Can risk and policy decisions about AI move forward independently of settling whether AI systems are actually conscious? This explores whether the empirical fact of user behavior matters more than metaphysical truth.
Explores whether AI systems trained to reduce negative emotions actually support wellbeing or destroy valuable emotional information. Matters because the design choice treats emotions as problems rather than functional signals.
Can generative AI's intersubjective stance—accepting and elaborating on users' reality frames—create conditions for shared false beliefs in ways that notebooks or search engines cannot?
This study explored whether prompt-engineered personas created in minutes could foster the same emotional and behavioral empathy as traditional user research. The findings reveal a surprising gap between understanding users and caring about their needs.
This explores whether web-browsing language models can infer personal attributes like gender, age, and political orientation from just a username and public profile. The finding matters because it reveals a privacy vulnerability that traditional API-based assumptions didn't anticipate.
Does removing the human-writing bottleneck through generative AI make it feasible to target voters at scale based on individual psychological traits? This matters because it could reshape political microtargeting economics and capabilities.
User ratings are assumed to be clean preference signals, but do they actually fluctuate unpredictably? This matters because recommender systems rely on ratings as ground truth, yet temporal inconsistency and individual rating styles may contaminate that signal.
Online reviews come only from customers who already expected to like a product. This self-selection might hide the true quality signal beneath layers of preference bias and writing motivation. What can aggregated ratings actually tell us?
If writers prefer AI-polished text but object to the persona shifts it introduces, does optimizing for preference actually solve the alignment problem or obscure it?