Language Understanding and Pragmatics · LLM Reasoning and Architecture

Do large language models reason symbolically or semantically?

Can LLMs follow explicit logical rules when those rules contradict their training knowledge? Testing whether reasoning operates independently of semantic associations reveals what computational mechanisms actually drive LLM multi-step inference.

Note · 2026-02-22 · sourced from Reasoning Logic Internal Rules
What makes chain-of-thought reasoning actually work? How do LLMs fail to know what they seem to understand? How should researchers navigate LLM reasoning research?

The "In-Context Semantic Reasoners" paper tests a fundamental question about what drives LLM reasoning by systematically decoupling semantics from the reasoning process across deduction, induction, and abduction tasks. The findings are clear: when semantics are consistent with commonsense, LLMs perform well; when semantics are removed or made counter-commonsense, performance collapses even when correct rules are provided in context.
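The three inference modes the paper probes can be made concrete with a toy sketch. This is my own illustration, not the paper's actual tasks: one family-relations rule, with each mode recovering a different missing piece of the rule/fact/conclusion triangle.

```python
# Toy illustration (not the paper's benchmark) of the three inference modes,
# over one rule: motherOf(x, y) => parentOf(x, y).

rule = ("motherOf", "parentOf")          # premise predicate -> conclusion predicate
fact = ("motherOf", "alice", "bob")
conclusion = ("parentOf", "alice", "bob")

def deduce(rule, fact):
    """Deduction: rule + fact |- conclusion."""
    prem, concl = rule
    return (concl,) + fact[1:] if fact[0] == prem else None

def induce(fact, conclusion):
    """Induction: fact + conclusion |- rule (guess the predicate mapping)."""
    return (fact[0], conclusion[0]) if fact[1:] == conclusion[1:] else None

def abduce(rule, conclusion):
    """Abduction: rule + conclusion |- the missing premise fact."""
    prem, concl = rule
    return (prem,) + conclusion[1:] if conclusion[0] == concl else None

assert deduce(rule, fact) == conclusion
assert induce(fact, conclusion) == rule
assert abduce(rule, conclusion) == fact
```

Each function manipulates only the positions of predicates and arguments, never their meanings, which is the sense in which these inferences are symbolic rather than semantic.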

The experimental design is precise. By replacing relation labels with shuffled alternatives ("motherOf" → "sisterOf", "female" → "male"), the researchers create tasks where the in-context rules are logically valid but semantically counter-intuitive. LLMs cannot follow these counter-commonsense rules despite having them explicitly in the prompt. The model's parametric knowledge — its compressed commonsense from training — overrides the in-context logical structure.
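The relabeling manipulation can be sketched in a few lines. This is my illustration, assuming a toy rule and a minimal `deduce` helper (only the shuffled predicate names follow the paper's example): a fixed permutation of predicate names leaves the deductive step completely unchanged, so a genuinely symbolic reasoner would be unaffected by it.

```python
# Illustrative sketch (not the paper's code): shuffling relation labels
# preserves logical structure while breaking semantic priors.

# Commonsense rule: motherOf(x, y) AND female(x) => parentOf(x, y)
rule = ("motherOf", "female", "parentOf")
facts = {("motherOf", "alice", "bob"), ("female", "alice")}

# Counter-commonsense relabeling: a fixed permutation of predicate names.
relabel = {"motherOf": "sisterOf", "female": "male", "parentOf": "childOf"}

def apply_relabel(atom):
    """Rename the predicate; the arguments are untouched."""
    return (relabel[atom[0]],) + atom[1:]

shuffled_facts = {apply_relabel(f) for f in facts}
shuffled_rule = tuple(relabel[p] for p in rule)

def deduce(rule, facts):
    """If both premises hold for some (x, y), the conclusion follows,
    regardless of what the predicate labels 'mean'."""
    p1, p2, concl = rule
    for (pred, x, y) in [f for f in facts if len(f) == 3 and f[0] == p1]:
        if (p2, x) in facts:
            return (concl, x, y)
    return None

print(deduce(rule, facts))                    # ('parentOf', 'alice', 'bob')
print(deduce(shuffled_rule, shuffled_facts))  # ('childOf', 'alice', 'bob')
```

The symbolic derivation goes through identically in both worlds; the paper's finding is that LLM accuracy does not, because the shuffled labels now fight the model's parametric priors.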

This reveals a specific computational mechanism: LLMs create "superficial logical chains" through semantic token associations, not through symbolic manipulation. The connections between tokens that enable multi-step reasoning are semantic connections, not logical ones. When those semantic connections support the correct answer, reasoning appears to work. When they conflict, reasoning fails regardless of what the prompt says.

The implication is that LLM reasoning is fundamentally bounded by training-distribution semantics. Taken together with "Can large language models translate natural language to logic faithfully?", the failure is bidirectional: LLMs can neither translate TO formal logic faithfully nor reason FROM formal logic when it conflicts with semantic priors. Taken together with "Do foundation models learn world models or task-specific shortcuts?", the semantic dependency IS the heuristic: the model uses semantic similarity as a proxy for logical validity.

This connects to the Dual Process Theory framework: human System II symbolic reasoning operates independently of semantic content, but LLM "reasoning" remains entangled with System I semantic associations. The paper's suggestion — integrating LLMs with external non-parametric knowledge bases and improving in-context knowledge processing — implicitly acknowledges that the LLM alone cannot escape this limitation.

Retort implication — rules out a class of anthropomorphization: The finding constrains what we can say about LLM behavior in other domains. Any account that treats LLMs as agents who "reverse-engineer" justifications for conclusions they have committed to — the standard anthropomorphization of sycophancy, rationalization, or motivated reasoning — presupposes the semantic competence this note shows LLMs lack. If reasoning collapses when semantics are decoupled, there is no separable reasoning faculty available to perform a post-hoc rationalization. What looks like reverse-engineering is pattern-matching within semantic associations. This rules out a whole class of AI commentary that treats LLMs as dishonest agents who could have reasoned correctly but chose not to.

Metaphor as paradigmatic semantic decoupling: Metaphor is the literary instantiation of this finding. A metaphor works by using one domain's vocabulary to illuminate another — "time is money," "argument is war," "memory is a jar of flies." The decoupling between the source domain's semantics and the target domain's meaning is the defining feature of metaphorical language. Since LLM reasoning collapses when semantics are decoupled from their typical packaging, and metaphor is decoupled semantics, this predicts a specific failure mode: LLMs should handle conventional metaphors (lexicalized, semantically consistent with commonsense) better than novel literary metaphors (where the mapping between domains is unexpected and requires conceptual reasoning beyond semantic association). The Diplomat dataset (Diplomat: A Dialogue Dataset for Situated PragMATic Reasoning) suggests treating all figurative language as a unified pragmatic reasoning task — but the semantic-decoupling finding predicts that this unified approach will hit a wall at the novelty threshold where metaphors stop relying on conventional semantic associations.


Source: Reasoning Logic Internal Rules; enriched from inbox/research-brief-llm-literary-analysis-2026-03-02.md

Original note title

llms are in-context semantic reasoners not symbolic reasoners — when semantics are decoupled reasoning collapses