Can opening politeness patterns predict whether conversations will turn hostile?

Do pragmatic politeness features in first exchanges—hedging, greetings, indirectness—reliably signal whether a conversation will later derail into personal attacks? Understanding early linguistic markers could help identify and prevent online hostility.

Note · 2026-02-22 · sourced from Conversation Topics Dialog

Conversations Gone Awry (Zhang et al. 2018) demonstrates that pragmatic devices used in the very first exchange of a conversation provide signal about whether it will subsequently go awry. This is not post-hoc detection of hostility: the prediction is made from openings that still appear civil, before any attack has occurred.

The framework draws on Brown and Levinson's politeness strategies. Positive politeness (gratitude, greetings, "please") encourages social connection; negative politeness (hedging, indirectness) dampens imposition. Both correlate with conversations staying on track. In contrast, direct questions and sentence-initial second-person pronouns ("You are...") correlate with eventual personal attacks — even when the opening content itself is not hostile.

The mechanism is pragmatic, not semantic. The opening content is not aggressive; it is the speech act structure that predicts trajectory. "Why's there no mention of it?" (direct question) and "I don't think this source is reliable" (hedged assertion) may address the same concern, but the hedged form is the one associated with conversations that stay civil. Directness can signal latent hostility and reinforces the forcefulness of a contentious imposition.
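As a rough illustration of what "speech act structure" means as a feature, the sketch below flags a handful of the markers named above in a single comment. The word lists and the function name `politeness_markers` are hypothetical and chosen for this note; they are not the lexicons or the feature extractor used by Zhang et al., which this note does not reproduce.

```python
import re

# Hypothetical, deliberately tiny lexicons for illustration only.
HEDGES = ("i think", "i don't think", "i believe", "it seems", "maybe", "perhaps", "might")
GREETINGS = ("hi", "hello", "hey")
GRATITUDE = ("thanks", "thank you")
QUESTION_WORDS = ("why", "what", "how", "who", "when", "where")


def contains_phrase(text, phrases):
    """True if any phrase occurs in text as whole words (not as a substring of a word)."""
    return any(re.search(r"\b" + re.escape(p) + r"\b", text) for p in phrases)


def first_word(sentence):
    """First alphabetic word of a sentence, so "why's" yields "why" and "you're" yields "you"."""
    words = re.findall(r"[a-z]+", sentence)
    return words[0] if words else ""


def politeness_markers(comment):
    """Flag surface markers of positive politeness, negative politeness, and directness."""
    text = comment.lower().replace("\u2019", "'")  # normalise curly apostrophes
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    starts = [first_word(s) for s in sentences]
    return {
        # positive politeness
        "greeting": contains_phrase(text, GREETINGS),
        "gratitude": contains_phrase(text, GRATITUDE),
        "please": contains_phrase(text, ("please",)),
        # negative politeness
        "hedge": contains_phrase(text, HEDGES),
        # directness
        "direct_question": any(
            s.endswith("?") and w in QUESTION_WORDS for s, w in zip(sentences, starts)
        ),
        "second_person_start": any(w in ("you", "your") for w in starts),
    }


direct = politeness_markers("Why's there no mention of it?")
hedged = politeness_markers("I don't think this source is reliable")
print(direct["direct_question"], direct["hedge"])
print(hedged["direct_question"], hedged["hedge"])
```

The two example sentences from the note land on opposite sides: the first is flagged as a direct question with no hedge, the second as a hedge with no direct question. Keyword matching of this kind is only a stand-in for a real pragmatic analysis and will miss most indirect phrasing.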

A particularly interesting finding: in conversations that derail, both interlocutors exhibit directness markers, not just the eventual attacker. First replies in derailing conversations contain more second-person pronouns (pushing back), while first replies in on-track conversations more often start with first-person pronouns ("I/We"), indicating a willingness to work together rather than argue against each other. The derailment is dyadic, not unilateral.
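The dyadic reading can be made concrete by profiling the opening comment and the first reply separately and then looking at both together. This is a minimal sketch under assumptions: the pronoun sets, the function names, and the `both_second_person` flag are hypothetical and invented for illustration, not taken from the study.

```python
import re

# Hypothetical pronoun sets; the study's actual feature definitions are not reproduced here.
SECOND_PERSON = {"you", "your"}
FIRST_PERSON = {"i", "we"}


def sentence_starts(comment):
    """Return the first alphabetic word of each sentence, lowercased."""
    text = comment.lower().replace("\u2019", "'")  # normalise curly apostrophes
    starts = []
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        words = re.findall(r"[a-z]+", sentence)  # "you're" -> "you", "i've" -> "i"
        if words:
            starts.append(words[0])
    return starts


def pronoun_profile(comment):
    """Count sentence-initial second-person and first-person pronouns in one comment."""
    starts = sentence_starts(comment)
    return {
        "second_person_starts": sum(w in SECOND_PERSON for w in starts),
        "first_person_starts": sum(w in FIRST_PERSON for w in starts),
    }


def exchange_profile(opening, first_reply):
    """Profile both turns of the first exchange, since the signal is dyadic."""
    a = pronoun_profile(opening)
    b = pronoun_profile(first_reply)
    return {
        "opening": a,
        "first_reply": b,
        # True only when BOTH speakers open a sentence with "you"/"your".
        "both_second_person": a["second_person_starts"] > 0
        and b["second_person_starts"] > 0,
    }


tense = exchange_profile(
    "You are misreading the source. It says otherwise.",
    "You're the one who removed it. Why?",
)
calm = exchange_profile(
    "I think the lead needs a citation.",
    "We could add the 2010 report. I can look for it.",
)
print(tense["both_second_person"], calm["both_second_person"])
```

Keeping the two turns as separate profiles, rather than pooling the text, is what lets a downstream model see that directness appears on both sides of the exchange.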

Quantitative support from Instagram hostility forecasting: A complementary study on Instagram demonstrates that early conversational signals predict hostile comment arrival 10+ hours in the future (AUC 0.82) and predict whether posts will receive more than 10 hostile comments (AUC 0.91). The predictive features — author's history of receiving hostile comments, user-directed profanity, number of distinct participating users, and hostility trends — overlap with but extend the politeness-strategy framework. Past hostility history and participant diversity add social context features beyond the linguistic politeness markers. Together, the two studies suggest that conversation derailment prediction benefits from combining pragmatic (speech act structure) and social (interaction history, network properties) signals.
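For readers unfamiliar with the metric, AUC can be read as the probability that a randomly chosen conversation that did turn hostile received a higher risk score than a randomly chosen one that did not. The sketch below computes it directly from that pairwise definition. The function name `roc_auc` and the toy labels and scores are made up for illustration and have no relation to the Instagram data or the reported 0.82 and 0.91 figures.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: 1 for conversations that turned hostile, 0 otherwise.
    scores: predicted risk, higher meaning more likely to turn hostile.
    Ties between a positive and a negative count as half a correct pair.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")

    correct = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(positives) * len(negatives))


# Toy data, invented for this example only.
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.1]
print(roc_auc(toy_labels, toy_scores))  # 3 of 4 pairs ranked correctly -> 0.75
```

On this reading, an AUC of 0.5 corresponds to chance ranking and 1.0 to perfect ranking. The quadratic pairwise loop is fine for illustration; a rank-based computation would be the usual choice at scale.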

This has design implications for conversational AI. Because conversational agents do not currently take the initiative (see "Why can't conversational AI agents take the initiative?"), AI systems cannot steer conversations away from derailment trajectories even if they could detect them. And because proactive agents risk feeling intrusive (see "How can proactive agents avoid feeling intrusive to users?"), the civility dimension of proactive design maps directly onto deploying these pragmatic strategies.

The face-saving research provides the complementary mechanism. Because language models tend to avoid correcting false user claims (see "Why do language models avoid correcting false user claims?"), LLMs are trained to avoid the very directness that signals derailment, but for the wrong reason: not to sustain civility but to avoid any form of disagreement. The politeness strategy framework and the face-saving finding address opposite failure modes. Politeness research shows that hedging and indirectness are associated with avoiding derailment (strategic avoidance of face-threat), while face-saving research shows that LLMs avoid correction even when it is necessary (pathological avoidance of face-threat). The design challenge is enabling models to use pragmatic politeness strategically without falling into face-saving accommodation of false information.


Source: Conversation Topics Dialog

Original note title

pragmatic politeness strategies in opening exchanges predict whether conversations will derail into personal attacks