ChatGPT Reads Your Tone and Responds Accordingly – Until It Does Not – Emotional Framing Induces Bias in LLM Outputs

Paper · arXiv 2507.21083 · Published June 17, 2025
Emotions · Psychology · Users · Alignment · Flaws · Linguistics, NLP, NLU · Design Frameworks

Background: Large Language Models (LLMs) like GPT-4 tailor their responses not just to the content but also to the tone of user prompts. Prior work has hinted that emotional phrasing – whether optimistic, neutral, or frustrated – can alter model behavior, but the scale and reliability of this effect have remained unquantified. This study investigates whether emotional tone alone can systematically bias LLM output, and whether safety alignment attenuates such effects.

Methods: We constructed over 52 “triplet prompts,” each expressing the same informational intent in three tones: neutral, positively worded, and negatively worded. GPT-4 (March 2025) generated answers to all variants. We analyzed the sentiment (valence) of each answer using high-confidence sentiment classification and constructed tone→valence transition matrices to detect systematic shifts. We also examined embedding distances and topic sensitivity, comparing everyday queries to alignment-constrained ones (e.g., political or moral). Tone-induced bias was quantified via these transition matrices and the Frobenius distances between them, which revealed suppressed tone effects on sensitive prompts.
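The transition-matrix and Frobenius-distance analysis can be summarized in a short sketch. The snippet below is illustrative only: it assumes answers have already been labeled by a sentiment classifier (the paper's specific classifier, prompt set, and data are not reproduced here), and the helper names and example pairs are hypothetical.

```python
# Minimal sketch of the tone -> valence analysis, under the assumptions above.
import numpy as np

TONES = ["negative", "neutral", "positive"]      # prompt tone (rows)
VALENCES = ["negative", "neutral", "positive"]   # answer valence (columns)

def transition_matrix(pairs):
    """Row-normalized counts of (prompt_tone, answer_valence) pairs."""
    counts = np.zeros((len(TONES), len(VALENCES)))
    for tone, valence in pairs:
        counts[TONES.index(tone), VALENCES.index(valence)] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    return counts / np.where(row_sums == 0, 1.0, row_sums)

def frobenius_distance(a, b):
    """Frobenius norm of the difference between two transition matrices."""
    return float(np.linalg.norm(a - b, ord="fro"))

# Hypothetical labeled examples: one (prompt_tone, answer_valence) pair per answer.
general_pairs = [("negative", "neutral"), ("negative", "positive"),
                 ("neutral", "positive"), ("positive", "positive")]
sensitive_pairs = [("negative", "neutral"), ("neutral", "neutral"),
                   ("positive", "neutral")]

T_general = transition_matrix(general_pairs)
T_sensitive = transition_matrix(sensitive_pairs)
print(frobenius_distance(T_general, T_sensitive))
```

A large distance between the two matrices would indicate that prompt tone moves answer valence on everyday topics but not on alignment-constrained ones, which is the comparison the Methods describe.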

Results: Prompt tone induced consistent, asymmetric shifts. Negative prompts rarely led to negative answers (∼14%) – instead, answers often rebounded to neutral or positive tone (∼58% and ∼28%, respectively). This “emotional rebound” effect reflects the model’s apparent tendency to counterbalance user negativity. Conversely, neutral and positive prompts rarely triggered negative replies (∼10–16%), revealing a “tone floor”: a built-in resistance to downward emotional shifts. These effects were robust across everyday topics but disappeared on sensitive issues, where responses remained nearly identical regardless of prompt tone – suggesting hard-coded alignment constraints. Frobenius distances between valence distributions confirmed that tone-induced variation was strong for general questions and negligible for sensitive ones.
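Continuing the sketch above, the two headline quantities can be read off such a matrix roughly as follows. The function names are ours, not the paper's, and the figures reported in the Results (∼14%, ∼58%, ∼28%, ∼10–16%) come from the paper, not from this toy example.

```python
# Continues the previous sketch (reuses TONES, VALENCES, np, T_general).
def rebound_rate(T):
    """Share of negative prompts whose answers come back neutral or positive."""
    neg = TONES.index("negative")
    return float(T[neg, VALENCES.index("neutral")] + T[neg, VALENCES.index("positive")])

def tone_floor_breaks(T):
    """Share of neutral/positive prompts that nevertheless receive negative answers."""
    neg_col = VALENCES.index("negative")
    rows = [TONES.index("neutral"), TONES.index("positive")]
    return float(np.mean([T[r, neg_col] for r in rows]))

print(rebound_rate(T_general), tone_floor_breaks(T_general))
```

A high rebound rate paired with a low tone-floor rate reproduces, in miniature, the asymmetry the Results describe.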

Conclusion: GPT-4 exhibits a tone-sensitive response pattern that reflects more than stylistic adaptation – it introduces systematic emotional bias. The model seems to detect user affect and shift into “comfort mode” when negativity is present, while refusing to echo pessimism unless explicitly invited. Although this trait may enhance user experience, it raises concerns for transparency and epistemic integrity: the same question yields different answers depending on emotional framing. Understanding these behavioral biases is essential for evaluating the objectivity, fairness, and controllability of aligned LLMs.

These tendencies raise urgent questions: When LLMs are used for search, advice, or decision support, how much does emotional framing affect their behavior? Do aligned models merely accommodate user tone – or does that tone systematically bias their responses? And does alignment training successfully block this effect when the topic is politically or morally sensitive?

Our results reveal two consistent patterns. First, negative prompts rarely yield negative answers. Instead, the model tends to “rebound” into neutral or positive tone – a behavior we term emotional rebound. Second, neutral and positive prompts almost never produce negative responses, suggesting the presence of a tone floor – a built-in reluctance to descend into negativity. These effects are statistically robust across lifestyle, factual, and advice queries. In contrast, on sensitive topics (e.g. politics, policy, medical ethics), tone effects vanish: the model produces nearly identical answers regardless of emotional framing, suggesting alignment constraints suppress affective flexibility.
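The embedding-distance check mentioned in the Methods can be sketched in the same spirit: if tone effects vanish on sensitive topics, answers to the three tone variants of a prompt should sit close together in embedding space. The embedding model below is an assumption for illustration, not the paper's stated choice.

```python
# Sketch: mean pairwise cosine distance between answers to the three tone variants.
import itertools
import numpy as np
from sentence_transformers import SentenceTransformer  # assumed embedding backend

model = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder model choice

def tone_spread(answers):
    """Mean pairwise cosine distance between a prompt's tone-variant answers."""
    emb = model.encode(answers)
    emb = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    return float(np.mean([1.0 - float(a @ b)
                          for a, b in itertools.combinations(emb, 2)]))

# A near-zero spread on sensitive topics and a larger spread on everyday topics
# would be consistent with the suppression pattern described above.
print(tone_spread(["answer to the neutral variant",
                   "answer to the positive variant",
                   "answer to the negative variant"]))
```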

Together, these findings highlight an overlooked aspect of LLM behavior: affective responsiveness is not just a surface trait – it shapes the informational output. While emotional adaptation may improve user experience, it also introduces hidden biases that challenge transparency, fairness, and epistemic integrity. We offer new tools for detecting and visualizing this behavior and argue that emotional tone deserves attention as a key variable in the alignment and interpretability of modern language models.