Can ethically aligned AI systems still communicate poorly?
Explores whether safety-aligned language models might fail at genuine conversation despite passing ethical benchmarks. This matters because pragmatic incompetence can erode trust and cause real harms in high-stakes domains.
Most discussion of LLM alignment focuses on the helpful-honest-harmless triad: preventing misinformation, toxic language, and harmful recommendations. Kasirzadeh and Gabriel argue that this prioritization has overshadowed a different and equally fundamental issue: even an ethically aligned LLM may fail to engage in conversation in pragmatically appropriate ways. The two alignment problems are orthogonal. A model can be honest, helpful, and harmless and still systematically violate Gricean maxims, lose common ground across turns, fail to track questions under discussion, mishandle context collapse, and produce pragmatically inappropriate utterances.
Their CONTEXT-ALIGN framework names a set of desiderata that ethical alignment does not deliver: tracking context-sensitivity and indexicals, common-ground management, scoreboard updating, QUD and discourse-structure handling, accommodation of repairs, pragmatic inference, ethical-pragmatic integration, context-collapse mitigation, identification of defective contexts, transparency in context-handling, and cross-contextual memory. These are all dimensions where conversation depends on something architectural — a model of the interlocutor and the situation — that no amount of RLHF on outputs touches.
The implication is sharp. An LLM that passes every safety eval is not thereby a competent conversational partner. Misalignments in pragmatic understanding lead to breakdowns, misinformation, and erosion of trust — and the higher the stakes (healthcare, legal, emergency), the more dangerous these failures become. Conversational alignment is not a stylistic add-on to ethical alignment. It is a separate layer of competence that the field has barely begun to engineer for.
Source: Conversation Topics Dialog
Original note title: Ethical alignment without conversational alignment produces pragmatically alien communicators