Language Understanding and Pragmatics

Why does ChatGPT fail at implicit discourse relations?

ChatGPT excels when discourse connectives are present but drops to 24% accuracy without them. What does this gap reveal about how LLMs actually process meaning and logical relationships?

Note · 2026-02-21 · sourced from Discourses
Where exactly does language competence break down in LLMs? How should researchers navigate LLM reasoning research?

The discourse relations paper (ChatGPT on temporal, causal, and discourse relations) found a dramatic asymmetry in ChatGPT's discourse understanding: strong performance when an explicit connective marks the relation, but only 24.54% accuracy when the relation is left implicit.

This is not a small gap. For an 11-class task, 24.54% accuracy on implicit discourse relations is strikingly low: uniform random guessing would already score about 9%. In the paper's words, ChatGPT "cannot understand the abstract sense of each discourse relation and the features from the text" when the surface connectives are absent.
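A quick back-of-envelope check of that baseline (figures taken from the note; the 11-class count assumes the relation inventory the paper evaluates against):

```python
# Sanity-check the chance baseline for an 11-class classification task.
# 24.54% is the implicit-relation accuracy reported in the note.
num_classes = 11
uniform_chance = 1 / num_classes   # probability a uniform random guess is correct
reported_accuracy = 0.2454         # ChatGPT on implicit discourse relations

print(f"Uniform chance:    {uniform_chance:.2%}")    # ~9.09%
print(f"Reported accuracy: {reported_accuracy:.2%}")
print(f"Ratio over chance: {reported_accuracy / uniform_chance:.2f}x")
```

So the score is above uniform chance, but nowhere near what a system that actually modeled the relations would achieve; note also that with a skewed class distribution, always predicting the majority relation can rival or exceed this figure.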

The explanation is transparent: LLMs have access to massive training data where connectives are pervasive and reliable signals. When you see "therefore" or "because," the discourse relation is explicit in the surface form. Learning to respond to these signals is straightforward statistical learning. Inferring the same relations without surface signals requires understanding what the two clauses actually mean and what logical relationship holds between them.
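The cue-based shortcut can be made concrete. A minimal sketch (the connective-to-relation table and function are hypothetical illustrations, not the paper's method): a system that reads only surface connectives succeeds exactly when the cue is present, and has no signal at all when it is absent.

```python
from typing import Optional

# Hypothetical lookup table of surface cues -- illustrative, not from the paper.
CONNECTIVE_CUES = {
    "because": "Cause",
    "therefore": "Result",
    "but": "Contrast",
    "then": "Temporal",
}

def classify_relation(arg1: str, arg2: str, connective: Optional[str]) -> Optional[str]:
    """Label the relation between two clauses from the surface connective alone."""
    if connective:
        return CONNECTIVE_CUES.get(connective.lower())
    # Implicit case: no surface cue, so a cue-based system has nothing to go on.
    # Inferring the relation here would require modeling what the clauses mean.
    return None

# Explicit: the connective gives the relation away.
print(classify_relation("It rained all night", "the match was cancelled", "therefore"))  # Result
# Implicit: same clauses, no connective -- the cue-based approach returns nothing.
print(classify_relation("It rained all night", "the match was cancelled", None))  # None
```

The toy version fails totally on implicit cases; the note's point is that an LLM's learned behavior degrades toward this pattern once the cue is removed, even though the clauses themselves are unchanged.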

This asymmetry shows that what LLMs have learned for discourse relation detection is largely cue-based — they respond to surface signals, not to structural meaning. When the surface cue is removed, the competence collapses.

This connects directly to What three layers must discourse systems actually track?: implicit discourse relation detection requires exactly the intentional structure that the linguistic structure alone doesn't carry.

A concrete instance beyond discourse relations: The same explicit/implicit asymmetry surfaces in metaphor extraction. LLMs can identify explicit source-target domain mappings (where the analogy's terms are stated) but fail on the implicit elements human readers routinely infer — e.g., the unstated target concept that completes a proportional analogy where only three of four terms are given. The failure is not specific to discourse-connective tasks; it is the general pattern wherever meaning depends on what is not said.

The literary analysis implication: Poetry and literary prose operate primarily through implicit relations. The connections between images in a poem, the causal logic of a narrative, the thematic resonance between scenes — these are rarely marked by explicit connectives. A poet does not write "the rose symbolizes mortality because..." The reader must infer the relation. This means the 24% implicit accuracy rate is not a peripheral limitation for literary analysis — it is a central one. As Can LLMs truly understand literary meaning or just mechanics? argues, the discourse competence asymmetry is one of four converging mechanisms that explain why LLMs can parse literary texts mechanically but cannot interpret them meaningfully.


Source: Discourses; enriched from inbox/research-brief-llm-literary-analysis-2026-03-02.md

llm discourse competence is asymmetric: explicit connectives enable performance but implicit relations cause systematic failure