LLM Reasoning and Architecture · Language Understanding and Pragmatics

Can language models understand a task without executing it correctly?

Do LLMs truly comprehend problem-solving principles if they consistently fail to apply them? This note explores whether the gap between articulate explanations and failed actions points to a fundamental architectural limitation.

Note · 2026-02-23 · sourced from Flaws
How do LLMs fail to do what they seem to understand?

LLMs display surface fluency yet systematically fail at tasks requiring symbolic reasoning, arithmetic accuracy, and logical consistency. The diagnosis: a persistent gap between comprehension and competence, rooted not in knowledge access but in computational execution.

The paper names this "computational split-brain syndrome" — instruction and action pathways are geometrically and functionally dissociated within the model. The model can articulate the correct principle for how to solve a problem, then fail to apply that principle in the next step. This is not forgetting, not hallucination, not knowledge deficit — it is a structural disconnect between knowing-how-to-describe and knowing-how-to-do.

The failure recurs across domains: mathematical operations, relational inferences, logical deductions. This consistency suggests an architectural rather than a domain-specific cause. LLMs function as powerful pattern-completion engines but lack the scaffolding for principled, compositional reasoning: the structure needed to execute what they can describe.

This provides a mechanistic name for the question "Can LLMs understand concepts they cannot apply?". Potemkin understanding names the phenomenon; computational split-brain names the mechanism. The geometric separation between instruction representations and execution pathways explains why the model can generate correct explanations and incorrect applications simultaneously without detecting the inconsistency.

It also concretizes "Why do language models fail to act on their own reasoning?". The 87% vs 64% gap is the quantitative signature of the split-brain: the instruction pathway (rationale generation) and the execution pathway (action selection) draw on overlapping but dissociated representations.
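
As a rough illustration, the sketch below scores the same problems twice: once for the stated principle, once for the applied answer, so the spread between the two accuracies isolates execution rather than knowledge. Everything here is a placeholder assumption (the `ask_model` stub, the prompts, the crude substring grader), not the protocol used in the source.

```python
# Minimal sketch of a rationale-vs-action probe. Assumes `ask_model` is
# wired to some LLM client; prompts and grading are illustrative only.

def ask_model(prompt: str) -> str:
    raise NotImplementedError  # plug in your LLM client here

def rationale_action_gap(problems: list[dict]) -> tuple[float, float]:
    """Score each problem twice and return (rationale_acc, action_acc).

    Each problem is a dict with 'question', 'principle' (a key phrase the
    correct rationale should contain), and 'answer' (the exact final answer).
    A large spread between the two accuracies is the split-brain signature
    (87% vs 64% in the source).
    """
    rationale_hits = 0
    action_hits = 0
    for p in problems:
        stated = ask_model(f"State the principle for solving: {p['question']}")
        applied = ask_model(f"Solve, giving only the final answer: {p['question']}")
        rationale_hits += int(p["principle"].lower() in stated.lower())  # crude grading
        action_hits += int(applied.strip() == p["answer"])
    n = len(problems)
    return rationale_hits / n, action_hits / n
```

Any serious version would need a stronger rationale grader than substring matching; the point is only that both accuracies are measured on identical items.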

The paper further argues that mechanistic interpretability findings may reflect training-specific pattern coordination rather than universal computational principles — the internal structures we discover may be execution artifacts, not reasoning architecture.

Planning as the paradigmatic test case. The 8-puzzle study (On the Limits of Innate Planning in Large Language Models) isolates two specific deficits: (1) brittle internal state representations that produce frequent invalid moves, and (2) weak heuristic planning, with models entering loops or selecting actions that do not reduce distance to the goal. Even with an external move validator offering only valid moves (a sketch of such a harness appears below), none of the models solved any puzzle. The comprehension-competence split is stark: models can articulate puzzle-solving strategies but cannot maintain accurate state representations across sequential moves. Together with "Can large language models actually create executable plans?", the results show the gap widening with task complexity: 87% correct rationales → 64% correct actions → 12% executable plans → 0% puzzle solutions even with validator assistance.
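
To make the validator setup concrete, here is a minimal sketch of such a harness, assuming a 3x3 board encoded as a flat tuple and a `choose_move` callable standing in for the model; the names and loop structure are illustrative, not the paper's code. Even when only legal moves are offered, success still requires tracking the state across turns and picking moves that shrink the distance to the goal, which is precisely where the study reports failure.

```python
# Hedged sketch of an external move validator for the 8-puzzle: the harness
# computes legality, so the model only ever chooses among valid moves.
from typing import Callable, List, Tuple

State = Tuple[int, ...]  # row-major 3x3 board, 0 marks the blank
GOAL: State = (1, 2, 3, 4, 5, 6, 7, 8, 0)

def valid_moves(state: State) -> List[int]:
    """Board positions of tiles adjacent to the blank (the only legal slides)."""
    blank = state.index(0)
    row, col = divmod(blank, 3)
    moves = []
    if row > 0:
        moves.append(blank - 3)  # tile above the blank
    if row < 2:
        moves.append(blank + 3)  # tile below the blank
    if col > 0:
        moves.append(blank - 1)  # tile left of the blank
    if col < 2:
        moves.append(blank + 1)  # tile right of the blank
    return moves

def apply_move(state: State, pos: int) -> State:
    """Slide the tile at `pos` into the blank square."""
    blank = state.index(0)
    board = list(state)
    board[blank], board[pos] = board[pos], board[blank]
    return tuple(board)

def run_episode(choose_move: Callable[[State, List[int]], int],
                start: State, max_steps: int = 60) -> bool:
    """Validator loop: the model picks only from a menu of valid moves."""
    state = start
    for _ in range(max_steps):
        if state == GOAL:
            return True
        options = valid_moves(state)
        pos = choose_move(state, options)  # e.g. an LLM selecting from the menu
        assert pos in options, "validator guarantees only menu moves are applied"
        state = apply_move(state, pos)
    return state == GOAL
```

The deficits the study isolates live entirely inside `choose_move`: losing track of `state` across turns, or cycling through moves that never reduce distance to `GOAL`.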


Source: Flaws

Original note title: comprehension without competence is a distinct LLM failure mode — instruction and execution pathways are dissociated