
Can reasoning systems forget history without losing coherence?

Does treating each reasoning step as independent—rather than accumulating historical context—actually preserve problem-solving quality while reducing computational waste? This explores whether Markov-style memoryless reasoning can scale effectively.

Note · 2026-05-18 · sourced from Reinforcement Learning
How should reasoning systems actually be architected? How should we allocate compute budget at inference time?

Existing test-time scaling methods all carry history along. Chain-based methods preserve the entire reasoning trace to generate each next step. Tree-based methods track ancestor and sibling relationships across branches. Graph-based methods compound this with arbitrary node dependencies. As reasoning scales, the accumulated historical dependencies waste compute and — worse — interfere with the model's ability to reason effectively on the current state.
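The growth in carried history can be made concrete with a rough token count. This is a back-of-the-envelope sketch, not a measurement: the function names, the fixed tokens-per-step figure, and the assumption that a memoryless state stays at a constant size are all illustrative choices that do not come from the paper.

```python
def chain_context_tokens(num_steps: int, question_tokens: int, tokens_per_step: int) -> int:
    """Total context tokens read by a chain-style method (hypothetical cost model).

    Step i (1-indexed) reads the question plus the i-1 earlier steps.
    """
    return sum(question_tokens + (i - 1) * tokens_per_step
               for i in range(1, num_steps + 1))


def memoryless_context_tokens(num_steps: int, state_tokens: int) -> int:
    """Total context tokens read when each step sees only the current state.

    Assumes (for illustration) that every state has the same size.
    """
    return num_steps * state_tokens


chain_total = chain_context_tokens(num_steps=10, question_tokens=100, tokens_per_step=50)
markov_total = memoryless_context_tokens(num_steps=10, state_tokens=100)
print(chain_total, markov_total)
```

Under this toy model the chain cost grows quadratically in the number of steps, while the memoryless cost grows linearly.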

Atom of Thoughts (2502.12018) makes a different bet: each reasoning state should be a simplified problem equivalent to the original, with partial reasoning steps either transformed into known conditions or excluded as incorrect explorations. The state transition mechanism has two phases. First, decompose the current question into a dependency-based directed acyclic graph (DAG) capturing structural information. Second, contract the subquestions into a new independent question. Iterate the decomposition-contraction until reaching directly-solvable atomic questions.
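The decompose-then-contract loop can be sketched on a toy domain. This is a minimal illustration, not the paper's implementation: nested arithmetic expressions stand in for questions, the expression tree stands in for the dependency DAG (a tree is a special case), and the helpers `is_atomic`, `solve_atomic`, `contract`, and `solve` are hypothetical names. In AoT itself, decomposition and contraction are carried out by an LLM.

```python
import operator
from typing import Union

# A "question" is a number or a tuple (op_name, left, right). Toy stand-in only.
Expr = Union[int, tuple]

OPS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}


def is_atomic(q: Expr) -> bool:
    """Directly solvable: a number, or an operation whose operands are both numbers."""
    return isinstance(q, int) or all(isinstance(arg, int) for arg in q[1:])


def solve_atomic(q: Expr) -> int:
    """Answer an atomic question without any further decomposition."""
    if isinstance(q, int):
        return q
    op, left, right = q
    return OPS[op](left, right)


def contract(q: Expr) -> Expr:
    """One state transition.

    Independent subquestions (those with no unresolved dependencies) are solved
    and folded back in as known values; dependent parts are kept, so the result
    is a simpler question with the same answer.
    """
    if is_atomic(q):
        return solve_atomic(q)
    op, left, right = q
    return (op, contract(left), contract(right))


def solve(question: Expr) -> int:
    """Iterate transitions; each one reads only the current state."""
    state = question
    while not is_atomic(state):
        state = contract(state)  # the previous state is discarded here
    return solve_atomic(state)


# 2*3 + (10 - (1+3))
q = ("add", ("mul", 2, 3), ("sub", 10, ("add", 1, 3)))
print(contract(q))  # ('add', 6, ('sub', 10, 4))
print(solve(q))     # 12
```

The loop variable `state` is the only thing passed between iterations, which is the point of the sketch: nothing about how a state was reached is needed to continue from it.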

The Markov property is the load-bearing claim. Each transition depends only on the current state, never on the path that produced it. This is not a heuristic; it is a structural property that holds whenever contraction preserves answer equivalence. If the contracted question yields the same answer as the original, no historical context is required to continue.
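Stated symbolically, the claim has two parts. The notation here is an assumption introduced for illustration: $Q_i$ is the question at step $i$, $Q_0$ is the original question, and $\mathrm{Ans}(\cdot)$ is the correct answer to a question.

```latex
% Markov property: the next state depends only on the current state.
P(Q_{i+1} \mid Q_0, Q_1, \dots, Q_i) = P(Q_{i+1} \mid Q_i)

% Answer equivalence preserved by each contraction, hence along the whole chain.
\mathrm{Ans}(Q_{i+1}) = \mathrm{Ans}(Q_i)
\quad\Longrightarrow\quad
\mathrm{Ans}(Q_n) = \mathrm{Ans}(Q_0)
```

The second line is what justifies the first: if every contraction preserves the answer, solving the final atomic question $Q_n$ solves $Q_0$, and the earlier states $Q_0, \dots, Q_{n-1}$ carry no information the solver still needs.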

The cognitive science motivation is direct. Humans solve complex problems by identifying and resolving self-evident subquestions, then reformulating a simplified problem state — not by maintaining detailed reasoning processes for resolved components. The reformulation IS the memory management.

Two architectural advantages emerge. First, AoT eliminates the need to maintain and compute over historical information when scaling test-time compute. Second, atomic questions can be integrated into existing test-time scaling (TTS) frameworks as a plug-in enhancement. Seen alongside the related note "Can recursive subtask trees overcome context window limits?", AoT is the language-level version of the same insight: TIMRUN prunes the KV cache to free positional embeddings; AoT contracts subproblems to free conceptual context. Both reject the assumption that more history equals better reasoning.


Paper: Atom of Thoughts for Markov LLM Test-Time Scaling

Original note title

markov-style memoryless reasoning replaces accumulated-history test-time scaling with iterative decompose-then-contract