What role does entity salience play in detecting incoherence?

This explores how tracking the entities a text commits to — who or what it's about — helps catch when that text stops cohering, rather than detecting incoherence at the surface word level.

The corpus doesn't have a paper labeled "entity salience" outright, but it has a strong adjacent thread. The most direct hit is work showing that dialogue coherence fails in four distinct semantic modes, two of which are squarely about entities: contradiction and coreference inconsistency. These failures are caught by classifiers trained on Abstract Meaning Representation (AMR), which makes who-refers-to-whom explicit, whereas text-level manipulations alone miss them (see "What semantic failures break dialogue coherence most realistically?"). That's the core answer: incoherence often hides in the entity graph, not the wording, so a detector that doesn't track entities is blind to a whole failure class.

There's a deeper reason entity tracking is hard for these models, and it's worth knowing. The 20-questions regeneration test shows that an LLM doesn't actually commit to a single character or object. It holds a superposition of candidates and samples one at generation time, so regenerating the same response yields different entities, each consistent with the prior context (see "Do large language models actually commit to a single character?"). If the model never firmly fixes an entity, coreference drift isn't a bug at the edges; it's baked into how generation works. That reframes "detecting incoherence" as detecting when the sampled entity has quietly shifted underneath consistent-looking prose.

The corpus also points to two contrasting ways to *find* that drift. One is meaning-level: semantic entropy clusters multiple sampled answers by whether they entail each other in both directions and flags divergence, a way of noticing when the "same" question produces semantically different commitments (see "Can we detect when language models confabulate?"). The other is structural: a small learned verifier operating on full token-to-token similarity maps reliably rejects "structural near-misses", things that look topically right but don't actually match, precisely because it reads the fine-grained interaction pattern rather than a compressed summary vector (see "Can verification separate structural near-misses from topical matches?"). Neither note is about entities by name, but both point the same way: coherence is won or lost where the salient entities are, and you need a representation that keeps them visible.
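The semantic-entropy idea above can be sketched in a few lines. This is a minimal illustration under loud assumptions, not the method as published: `toy_entails` is a hypothetical stand-in for a real natural-language-inference model, and the entropy here is computed over cluster frequencies among the samples rather than over model sequence probabilities. The sample answers are invented for illustration.

```python
import math
import string
from typing import Callable, List

# ASSUMPTION: toy stand-in for an NLI model. A real detector would call an
# entailment classifier here; this one only compares content words.
_STOPWORDS = {"a", "an", "the", "it", "is", "its"}
_PUNCT = str.maketrans("", "", string.punctuation)


def _content_words(text: str) -> frozenset:
    words = text.lower().translate(_PUNCT).split()
    return frozenset(w for w in words if w not in _STOPWORDS)


def toy_entails(premise: str, hypothesis: str) -> bool:
    """Hypothetical entailment check: hypothesis adds no new content words."""
    return _content_words(hypothesis) <= _content_words(premise)


def cluster_by_entailment(
    answers: List[str], entails: Callable[[str, str], bool]
) -> List[List[str]]:
    """Group answers that entail each other in both directions."""
    clusters: List[List[str]] = []
    for answer in answers:
        for cluster in clusters:
            representative = cluster[0]
            if entails(answer, representative) and entails(representative, answer):
                cluster.append(answer)
                break
        else:
            # No existing cluster means the same thing: start a new one.
            clusters.append([answer])
    return clusters


def semantic_entropy(
    answers: List[str], entails: Callable[[str, str], bool]
) -> float:
    """Entropy over meaning clusters, using cluster frequency as probability."""
    if not answers:
        return 0.0
    clusters = cluster_by_entailment(answers, entails)
    probs = [len(c) / len(answers) for c in clusters]
    return sum(-p * math.log(p) for p in probs)


# Invented samples for the same "which entity is it?" question.
stable = ["a cat", "It is a cat.", "the cat"]
drifting = ["a cat", "a dog", "It is a cat.", "an owl"]

stable_score = semantic_entropy(stable, toy_entails)
drifting_score = semantic_entropy(drifting, toy_entails)
```

In this toy run the three `stable` answers differ in wording but land in one cluster, so the score is zero, while the `drifting` answers split into three clusters and score higher. The design point is that the comparison happens over meanings (here, which entity was named), not over token strings.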

Two more notes round out why this matters. Models routinely fail to integrate context when strong training-time associations override the entities actually present in the prompt, which is incoherence driven by the wrong entity being "loud" in parametric memory (see "Why do language models ignore information in their context?"). They also fail badly at holding multiple valid interpretations of ambiguous text at once (see "Can language models recognize when text is deliberately ambiguous?"). Put together, the picture is this: "detecting incoherence" is less about catching contradictions in sentences and more about whether a system can keep track of which entities it has actually committed to, and the corpus suggests the surface text is the last place that breakdown shows up.

Sources (6 notes)

What semantic failures break dialogue coherence most realistically?

Research using Abstract Meaning Representation identified four distinct incoherence types: contradiction, coreference inconsistency, irrelevancy, and decreased engagement. AMR-trained classifiers detect these semantic failures while text-level manipulations alone cannot.

Do large language models actually commit to a single character?

Shanahan's 20-questions test shows LLMs maintain a superposition of consistent objects or characters and sample from that distribution at generation time. Regenerating the same response yields different outputs, each consistent with prior context, proving no fixed commitment exists.

Can we detect when language models confabulate?

Clustering sampled answers by bidirectional entailment and computing entropy over semantic clusters catches confabulations invisible at token level. This self-referential approach works across tasks without task-specific training data.

Can verification separate structural near-misses from topical matches?

A two-stage pipeline—pooled-cosine recall followed by a small Transformer verifier operating on token-token similarity maps—reliably rejects structural near-misses that MaxSim-style late interaction cannot. The verifier succeeds because it operates on full token interaction patterns rather than compressed vectors.

Why do language models ignore information in their context?

Research demonstrates that LMs generate outputs inconsistent with their context because parametric knowledge from training dominates over in-context information. Textual prompting alone cannot override strong priors; causal intervention in representations is required.

Can language models recognize when text is deliberately ambiguous?

AMBIENT benchmark shows GPT-4 correctly disambiguates only 32% of cases versus 90% for humans. This failure spans lexical, structural, and scope ambiguity—revealing that LLMs cannot hold multiple interpretations simultaneously, a fundamental gap hidden by standard benchmarks.
