
Can reasoning systems forget history without losing coherence?

Does treating each reasoning step as independent, rather than accumulating historical context, actually preserve problem-solving quality while reducing computational waste? This note explores whether Markov-style memoryless reasoning can scale effectively.

Note · 2026-05-18 · sourced from Reinforcement Learning

Existing test-time scaling methods all carry history along. Chain-based methods preserve the entire reasoning trace to generate each next step. Tree-based methods track ancestor and sibling relationships across branches. Graph-based methods compound this with arbitrary node dependencies. As reasoning scales, the accumulated historical dependencies waste compute and — worse — interfere with the model's ability to reason effectively on the current state.
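A back-of-the-envelope count shows why carried history gets expensive. The symbols are assumptions made for illustration, not figures from the paper: a chain of n steps in which each step adds about s tokens and step k processes roughly ks tokens of context, compared with a memoryless process whose state never exceeds m tokens.

```latex
\begin{align}
C_{\text{chain}} &= \sum_{k=1}^{n} k s = \frac{s\,n(n+1)}{2} \\
C_{\text{memoryless}} &\le n m
\end{align}
```

Under these assumptions the chain's cost grows quadratically in the number of steps, while the memoryless cost grows linearly. The estimate ignores the extra model calls a memoryless method spends on producing each new state, so it is a rough comparison only.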

Atom of Thoughts (AoT, arXiv:2502.12018) makes a different bet: each reasoning state should be a simplified problem equivalent to the original, with partial reasoning steps either transformed into known conditions or excluded as incorrect explorations. The state transition mechanism has two phases. First, decompose the current question into a dependency-based directed acyclic graph (DAG) that captures its structure. Second, contract the subquestions into a new independent question. The decomposition-contraction cycle repeats until the question is atomic, that is, directly solvable.
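The two-phase transition can be sketched on a toy domain. This is a minimal illustration, not the paper's implementation: questions are nested arithmetic tuples standing in for natural-language questions, the expression tree plays the role of the dependency DAG, and every name here (`is_atomic`, `decompose`, `contract`, `aot_solve`) is hypothetical. In AoT itself both phases are carried out by an LLM.

```python
# Toy "question": an int, or a tuple (op, left, right) whose operands are questions.
OPS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
}

def is_atomic(q):
    """Atomic = directly solvable: a number, or one operation on two numbers."""
    return isinstance(q, int) or all(isinstance(x, int) for x in q[1:])

def solve_atomic(q):
    """Answer an atomic question directly."""
    if isinstance(q, int):
        return q
    op, a, b = q
    return OPS[op](a, b)

def decompose(q):
    """Phase 1: list the independent subquestions (those with no unresolved dependencies)."""
    if isinstance(q, int):
        return []
    if is_atomic(q):
        return [q]
    _, left, right = q
    return decompose(left) + decompose(right)

def contract(q, known):
    """Phase 2: fold solved subquestions in as known values, giving a simpler question."""
    if isinstance(q, int):
        return q
    if q in known:
        return known[q]
    op, left, right = q
    return (op, contract(left, known), contract(right, known))

def aot_solve(q):
    """Iterate decompose-then-contract until the question is atomic."""
    trace = [q]
    while not is_atomic(q):
        # The next state is computed from the current question only.
        known = {sub: solve_atomic(sub) for sub in decompose(q)}
        q = contract(q, known)
        trace.append(q)
    return solve_atomic(q), trace

question = ("add", ("mul", 2, 3), ("sub", 10, ("add", 1, 1)))
answer, trace = aot_solve(question)
```

Here `trace` holds three states: the original question, `("add", 6, ("sub", 10, 2))`, and `("add", 6, 8)`, with `answer` equal to 14. Calling `aot_solve` on any intermediate state returns the same answer, because each state is a complete question in its own right.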

The Markov property is the load-bearing claim. Each transition depends only on the current state — never on the path that produced it. This is not a heuristic; it is a structural property guaranteed by answer-equivalence preservation through contraction. If the contracted question yields the same answer as the original, no historical context is required to continue.
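One way to write the claim down, using notation assumed here for illustration rather than quoted from the paper: let Q_i be the question at step i and A(Q) its answer.

```latex
\begin{align}
P(Q_{i+1} \mid Q_0, Q_1, \dots, Q_i) &= P(Q_{i+1} \mid Q_i) && \text{(Markov property)} \\
A(Q_{i+1}) &= A(Q_i) && \text{(answer equivalence under contraction)}
\end{align}
```

If the second line holds at every step, induction gives A(Q_n) = A(Q_0), so solving the final atomic question answers the original one without consulting any earlier state.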

The cognitive science motivation is direct. Humans solve complex problems by identifying and resolving self-evident subquestions, then reformulating a simplified problem state — not by maintaining detailed reasoning processes for resolved components. The reformulation IS the memory management.

Two architectural advantages emerge. AoT eliminates the need to maintain and compute historical information when scaling test-time compute, and its atomic questions can be integrated into existing test-time scaling (TTS) frameworks as a plug-in enhancement. Compared with the approach in "Can recursive subtask trees overcome context window limits?", AoT is the language-level version of the same insight: TIMRUN prunes the KV cache to free positional embeddings, while AoT contracts subproblems to free conceptual context. Both reject the assumption that more history equals better reasoning.

Paper: Atom of Thoughts for Markov LLM Test-Time Scaling

Related concepts in this collection

Can recursive subtask trees overcome context window limits?
Explores whether modeling reasoning as prunable trees of subtasks could eliminate the context length constraints that currently force developers into multi-agent architectures. Asks if working memory can become truly unlimited through selective KV cache retention.
TIMRUN does the same at the KV-cache layer; AoT does it at the conceptual layer; both reject history-accumulating reasoning

Why does parallel reasoning outperform single chain thinking?
Does dividing a fixed token budget across multiple independent reasoning paths beat spending it all on one long chain? This explores how breadth and diversity in reasoning compare to depth.
AoT can compose with parallel sampling once each branch is memoryless

Do iterative refinement methods suffer from overthinking?
Iterative refinement approaches like Self-Refine structurally resemble token-level overthinking in o1-like models. Does revision across multiple inference calls reproduce the same accuracy degradation seen within single inferences?
AoT is the structural fix that PDR's bounded-workspace also targets: bounded state via contraction rather than via summarization


Original note title

markov-style memoryless reasoning replaces accumulated-history test-time scaling with iterative decompose-then-contract
