LLM Reasoning and Architecture

Can reasoning stay grounded without external feedback loops?

Explores whether language models can maintain accurate reasoning through their own internal chains of thought, or whether they need real-world feedback to avoid hallucination and error propagation.

Note · 2026-02-22 · sourced from Reasoning Architectures

Pure chain-of-thought reasoning is open-loop: the model generates each reasoning step from its own internal representations, with no external correction mechanism. When an early step hallucinates or drifts, every subsequent step builds on the error; error propagation is the structural consequence of having no feedback loop to reality.
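The compounding effect can be made concrete with a toy calculation (the per-step reliability here is an illustrative assumption, not a measured figure):

```python
# Toy model of error propagation in an ungrounded reasoning chain.
# Assumption: each step is correct with independent probability p_step,
# and any single wrong step corrupts the final answer.
def chain_success_probability(p_step: float, n_steps: int) -> float:
    """Probability that every step in an n-step chain is correct."""
    return p_step ** n_steps

# Even a 95%-reliable step degrades quickly over long chains:
# at 20 steps the chain succeeds only ~36% of the time.
for n in (1, 5, 10, 20):
    print(n, round(chain_success_probability(0.95, n), 3))
```

This is exactly the failure mode a feedback loop interrupts: an external observation can reset the chain to ground truth instead of letting the multiplicative decay run unchecked.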

ReAct addresses this by interleaving two kinds of operations:

- Reasoning traces ("thoughts"): free-form language steps that decompose the task, track progress, and plan what to do next.
- Actions: structured calls to an external environment (e.g., a Wikipedia search API) whose observations are fed back into the context.

The interleaving is tightly coupled: reasoning identifies what information is needed, action retrieves it, reasoning interprets it and updates the plan. This is not reasoning first then acting — it is continuous mutual conditioning where each reasoning step can trigger an action, and each action result reshapes the next reasoning step.
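The mutual-conditioning loop can be sketched in a few lines (the `llm` and `wiki_search` callables below are hypothetical stand-ins for a real model and a real retrieval API, and the Thought/Action/Observation format is the ReAct transcript convention):

```python
# Minimal sketch of a ReAct loop. Each turn, the model reads the full
# transcript so far, so every observation reshapes the next thought.
def react_loop(question: str, llm, wiki_search, max_turns: int = 5) -> str:
    transcript = f"Question: {question}\n"
    for _ in range(max_turns):
        # Reasoning step: the model emits a thought plus a proposed action.
        thought, action, arg = llm(transcript)
        transcript += f"Thought: {thought}\nAction: {action}[{arg}]\n"
        if action == "Finish":
            return arg  # the model's final answer
        # Acting step: the environment's observation is appended to the
        # transcript, grounding the next reasoning step in external feedback.
        observation = wiki_search(arg)
        transcript += f"Observation: {observation}\n"
    return "no answer within turn budget"
```

The key design point is that the observation enters the same context the model reasons over, so correction happens in-band rather than through a separate verification pass.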

Empirical results: On knowledge-intensive QA (HotpotQA, FEVER), where pure CoT hallucinates and propagates errors, ReAct's Wikipedia API interaction allows real-time fact-checking and error correction. On interactive decision making (ALFWorld, WebShop), ReAct outperforms imitation and reinforcement learning methods by 34% and 10% absolute success rate respectively, with only 1-2 in-context examples.

The mechanism mirrors human "inner speech": verbal reasoning supports working memory, tracks task state, and handles exceptions. ReAct externalizes this to allow fact-grounding of the reasoning content itself, not just structural organization of reasoning steps.

This is the foundational architectural pattern that subsequent designs either extend (ReWOO separating planning from execution) or abstract from (CoA using abstract placeholders instead of waiting for real responses). Understanding what ReAct prevents (error propagation from ungrounded chains) explains why architectural evolution moved toward earlier separation of planning from execution.
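The ReWOO contrast can be sketched concretely: the planner emits the entire plan up front, with evidence placeholders standing in for tool results, and a worker fills the slots afterward. The `#E1`-style slot syntax below is illustrative of the pattern, not the paper's exact prompt format:

```python
import re

# Illustrative ReWOO-style executor: the plan is produced in one pass and
# references evidence slots (#E1, #E2, ...) instead of waiting, ReAct-style,
# for each real tool response before planning the next step.
def execute_plan(plan: list[tuple[str, str, str]], tools: dict) -> dict:
    """plan: list of (slot, tool_name, arg_template) tuples."""
    evidence = {}
    for slot, tool, arg_template in plan:
        # Substitute earlier evidence slots into this step's argument.
        arg = re.sub(r"#E\d+", lambda m: evidence[m.group(0)], arg_template)
        evidence[slot] = tools[tool](arg)
    return evidence

# Hypothetical tools for demonstration only.
tools = {"Search": lambda q: f"result({q})", "LLM": lambda p: f"answer({p})"}
plan = [("#E1", "Search", "capital of France"),
        ("#E2", "LLM", "summarize #E1")]
# execute_plan(plan, tools)["#E2"] == "answer(summarize result(capital of France))"
```

The trade-off is visible in the sketch: planning is cheaper (one model call instead of one per step), but the plan cannot react to surprising observations the way ReAct's interleaved loop can.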



interleaved reasoning and action prevents hallucination by grounding reasoning traces in external world feedback rather than model-internal associations