Can opening politeness patterns predict whether conversations will turn hostile?
Do pragmatic politeness features in first exchanges—hedging, greetings, indirectness—reliably signal whether a conversation will later derail into personal attacks? Understanding early linguistic markers could help identify and prevent online hostility.
Conversations Gone Awry (Zhang et al. 2018) demonstrates that pragmatic devices used in the very first exchange of a conversation carry signal about whether it will later go awry. This is not post-hoc detection of hostility: it is a prediction of future trajectory from openings that still appear civil.
The framework draws on Brown and Levinson's politeness strategies. Positive politeness (gratitude, greetings, "please") encourages social connection; negative politeness (hedging, indirectness) dampens imposition. Both correlate with conversations staying on track. In contrast, direct questions and sentence-initial second-person pronouns ("You are...") correlate with eventual personal attacks — even when the opening content itself is not hostile.
The mechanism is pragmatic, not semantic. The content is not aggressive; the speech-act structure is what predicts trajectory. "Why's there no mention of it?" (direct question) and "I don't think this source is reliable" (hedged assertion) may address the same concern, but the hedged version sustains civility more reliably. Directness can signal latent hostility, and it makes a contentious imposition more forceful.
A particularly interesting finding: in conversations that derail, both interlocutors exhibit directness markers, not just the eventual attacker. First replies in derailing conversations contain more second-person pronouns (pushing back), while first replies in on-track conversations more often start in the first person ("I/We"), indicating a willingness to work together rather than argue against each other. The derailment is dyadic, not unilateral.
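The markers discussed above can be made concrete with a minimal sketch. It assumes tiny hand-written word lists and surface pattern matching; the lists, the function name `opening_features`, and the feature names are illustrative assumptions, not the lexicons or parser-based features used in the study.

```python
import re

# Illustrative word lists (assumptions, not the study's lexicons).
GRATITUDE = {"thanks", "thank", "appreciate"}
GREETINGS = {"hi", "hello", "hey"}
HEDGES = {"think", "maybe", "perhaps", "seems", "possibly", "might"}
FIRST_PERSON = {"i", "we"}
SECOND_PERSON = {"you", "your"}
QUESTION_WORDS = {"why", "what", "how", "who", "where", "when"}


def tokenize(sentence):
    """Lowercase a sentence and split it into word tokens."""
    return re.findall(r"[a-z']+", sentence.lower())


def opening_features(comment):
    """Return coarse politeness and directness markers for one comment."""
    features = {
        "gratitude": False,
        "greeting": False,
        "please": False,
        "hedge": False,
        "first_person_start": False,
        "second_person_start": False,
        "direct_question": False,
    }
    # Naive sentence split on terminal punctuation followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", comment.strip()) if s]
    for sentence in sentences:
        tokens = tokenize(sentence)
        if not tokens:
            continue
        words = set(tokens)
        # Strip contractions from the first token, e.g. "why's" -> "why".
        first = tokens[0].split("'")[0]
        features["gratitude"] |= bool(words & GRATITUDE)
        features["greeting"] |= bool(words & GREETINGS)
        features["please"] |= "please" in words
        features["hedge"] |= bool(words & HEDGES)
        features["first_person_start"] |= first in FIRST_PERSON
        features["second_person_start"] |= first in SECOND_PERSON
        # A question that opens with a wh-word, with no softening lead-in.
        features["direct_question"] |= (
            first in QUESTION_WORDS and sentence.rstrip().endswith("?")
        )
    return features


direct = opening_features("Why's there no mention of it?")
hedged = opening_features("I don't think this source is reliable")
```

On the two example sentences from the note, `direct` flags only `direct_question`, while `hedged` flags `hedge` and `first_person_start`. A real feature extractor would need proper parsing to separate, say, a hedging "I think" from a literal use of "think".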
Quantitative support from Instagram hostility forecasting: A complementary study on Instagram demonstrates that early conversational signals predict the arrival of hostile comments 10 or more hours ahead (AUC 0.82) and predict whether a post will receive more than 10 hostile comments (AUC 0.91). The predictive features (the author's history of receiving hostile comments, user-directed profanity, the number of distinct participating users, and hostility trends) overlap with but extend the politeness-strategy framework: past hostility history and participant diversity add social context beyond the linguistic politeness markers. Together, the two studies suggest that derailment prediction benefits from combining pragmatic signals (speech-act structure) with social signals (interaction history, network properties).
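For readers unfamiliar with the AUC figures quoted above: AUC is the probability that a randomly chosen positive example (here, a conversation that turns hostile) receives a higher predicted score than a randomly chosen negative one, with ties counted as half. The sketch below computes it directly from that definition; the labels and scores are invented toy values, not data from either study, and the function name `roc_auc` is an assumption for illustration.

```python
def roc_auc(labels, scores):
    """Probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy data (invented): 1 = conversation later turned hostile, 0 = stayed civil.
toy_labels = [1, 1, 0, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.3, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)  # 5 of 6 positive/negative pairs ranked correctly
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, so the reported 0.82 and 0.91 mean that most hostile/non-hostile pairs are ordered correctly. The pairwise loop is quadratic and meant only to show the definition; a library implementation would be preferable on real data.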
This has design implications for conversational AI. Per the related note "Why can't conversational AI agents take the initiative?", AI systems cannot currently steer conversations away from derailment trajectories even if they could detect them. And per "How can proactive agents avoid feeling intrusive to users?", the civility dimension of proactive design maps directly onto deploying these pragmatic strategies.
The face-saving research provides the complementary mechanism: per "Why do language models avoid correcting false user claims?", LLMs are trained to avoid the very directness that signals derailment, but for the wrong reason: not to sustain civility but to avoid any form of disagreement. The politeness-strategy framework and the face-saving finding address opposite failure modes. Politeness research shows that hedging and indirectness prevent derailment (strategic avoidance of face-threat), while face-saving research shows that LLMs avoid correction even when it is necessary (pathological avoidance of face-threat). The design challenge is enabling models to use pragmatic politeness strategically without falling into face-saving accommodation of false information.
Source: Conversation Topics Dialog
Related concepts in this collection
- How can proactive agents avoid feeling intrusive to users?
  Explores why proactive conversational agents often feel annoying rather than helpful, and what design dimensions could prevent them from violating user expectations and autonomy.
  politeness strategies are the pragmatic mechanism underlying the civility dimension
- Can language models adapt implicature to conversational context?
  Do large language models flexibly modulate scalar implicatures based on information structure, face-threatening situations, and explicit instructions, as humans do? This tests whether pragmatic computation is truly context-sensitive or merely literal.
  another pragmatic competence gap; politeness is context-dependent like scalar implicature
- Why do speakers deliberately use ambiguous language?
  Explores whether ambiguity is a linguistic defect or a strategic tool speakers use for efficiency, politeness, and deniability. Matters because it challenges how we train language systems.
  hedging introduces productive ambiguity that maintains conversational options
- Why do language models avoid correcting false user claims?
  Explores whether LLM grounding failures stem from missing knowledge or from conversational dynamics. Examines whether models use face-saving strategies similar to humans when disagreement is needed.
  complementary face-threat mechanism: politeness strategies show strategic hedging prevents derailment, while face-saving shows pathological avoidance of disagreement prevents correction; the design challenge is strategic vs. pathological face-management
- Can conversation structure predict dialogue success better than content?
  Does the geometric shape of how dialogue unfolds (timing, repetition, topic drift) matter as much as what people actually say? This explores whether interactive patterns hold signals hidden in word choice alone.
  politeness predicts trajectory from pragmatic features in opening exchanges; TRACE predicts from continuous embedding geometry; two complementary signal types capturing the same phenomenon at different levels of abstraction
- Can models learn to abstain when uncertain about predictions?
  Explores whether language models can be trained to recognize when they lack sufficient information to forecast conversation outcomes, rather than forcing uncertain predictions into confident-sounding responses.
  politeness features identify WHICH early signals predict derailment; calibrated forecasting provides HOW to quantify confidence in those predictions and when to abstain
Original note title
pragmatic politeness strategies in opening exchanges predict whether conversations will derail into personal attacks