LLM Reasoning and Architecture

Can reasoning topologies be formally classified as graph types?

This explores whether Chain of Thought, Tree of Thought, and Graph of Thought represent distinct formal graph structures with different computational properties. Understanding this matters because the topology itself determines what reasoning strategies are possible.

Note · 2026-02-22 · sourced from Reasoning Methods CoT ToT

The "Demystifying Chains, Trees, and Graphs of Thoughts" paper imposes formal graph-theoretic structure on the reasoning topology landscape:

The topology is not just metaphor — it is the actual computational structure of the reasoning process, expressed in how the LLM context is managed and extended over time. As the authors put it, the representation becomes "to a degree spatial" as the graph is serialized into the LLM context.

The GoT case is the most generative: allowing in-degree > 1 means a thought can synthesize multiple previous reasoning paths. This lets the model pursue independent sub-problems and combine their results — something neither chains nor trees allow. The practical implication is that GoT can express divide-and-conquer strategies that trees cannot, because in a tree each node has exactly one parent.
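The chain/tree/graph distinction reduces to degree constraints on the reasoning graph. A minimal sketch (class and method names are illustrative, not from the paper) that classifies a topology purely from each node's in-degree and out-degree:

```python
from collections import defaultdict

class ThoughtGraph:
    """Toy classifier for reasoning topologies, distinguished only by
    node degrees: CoT = chain, ToT = tree, GoT = general DAG."""

    def __init__(self):
        self.parents = defaultdict(list)   # node -> parent nodes
        self.children = defaultdict(list)  # node -> child nodes

    def add_thought(self, node, parents=()):
        for p in parents:
            self.parents[node].append(p)
            self.children[p].append(node)

    def classify(self):
        if any(len(ps) > 1 for ps in self.parents.values()):
            return "graph"   # GoT: a thought may aggregate several paths
        if any(len(cs) > 1 for cs in self.children.values()):
            return "tree"    # ToT: branching, but one parent per node
        return "chain"       # CoT: strictly linear

g = ThoughtGraph()
g.add_thought("root")
g.add_thought("subproblem_a", parents=["root"])
g.add_thought("subproblem_b", parents=["root"])
g.add_thought("synthesis", parents=["subproblem_a", "subproblem_b"])
print(g.classify())  # the merge node has in-degree 2 -> "graph"
```

The merge node is the whole difference: delete it and the same structure classifies as a tree; remove the branching and it collapses to a chain.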

Multi-paradigm chaining as a new topology (Chain-of-Reasoning/CoR): CoR chains Natural Language Reasoning (NLR), Algorithmic Reasoning (AR), and Symbolic Reasoning (SR) together. Each paradigm can independently derive an answer; CoR synthesizes them into a final solution. Progressive Paradigm Training (PPT) allows models to master paradigms sequentially, achieving zero-shot cross-task generalization. This is a distinct topology type: not branching (like ToT) or aggregation of parallel identical chains (like CoT-SC), but heterogeneous-paradigm chaining where each node uses a different reasoning modality. The practical value: different paradigms excel at different problem aspects (NLR for semantic context, AR for execution, SR for formal proofs), and chaining captures complementary strengths.
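The heterogeneous-paradigm chain can be sketched as a pipeline where each stage is a different reasoning modality that sees the accumulated context. The paradigm functions below are stand-ins (in CoR they would be LLM calls in natural-language, algorithmic, and symbolic modes), and the trivial synthesis step is a placeholder assumption:

```python
def nlr(problem, context):            # Natural Language Reasoning stand-in
    return {"paradigm": "NLR", "answer": f"informal argument for {problem}"}

def ar(problem, context):             # Algorithmic Reasoning stand-in
    return {"paradigm": "AR", "answer": f"executable procedure for {problem}"}

def sr(problem, context):             # Symbolic Reasoning stand-in
    return {"paradigm": "SR", "answer": f"formal derivation for {problem}"}

def chain_of_reasoning(problem, paradigms=(nlr, ar, sr)):
    """Run heterogeneous paradigms sequentially; each sees all prior
    outputs, and a final step synthesizes the candidate answers."""
    context = []
    for paradigm in paradigms:
        context.append(paradigm(problem, context))
    # Synthesis placeholder: CoR would use the model itself to combine.
    return {"candidates": [c["answer"] for c in context],
            "final": context[-1]["answer"]}

result = chain_of_reasoning("x^2 = 4")
print(len(result["candidates"]))  # 3 — one candidate per paradigm
```

The structural point survives the toy implementation: unlike CoT-SC, the parallel candidates are not identical chains but different modalities, so aggregation captures complementary strengths rather than averaging out sampling noise.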

This taxonomy clarifies the question Why does parallel reasoning outperform single chain thinking? — CoT-SC (multiple parallel paths) occupies a structurally different topology from sequential CoT, and the advantage comes from the topology itself, not from inference tricks. It also grounds the hybrid methods (beam search, MCTS) as implementations that traverse trees with pruning rather than exhaustive expansion.
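CoT-SC's aggregation step is just majority voting over independently sampled chains. A minimal, reproducible sketch (the deterministic sampler below is an assumption standing in for stochastic LLM decodes):

```python
from collections import Counter

def self_consistency(sample_chain, n_paths=5):
    """CoT-SC sketch: sample several independent reasoning chains and
    majority-vote their final answers. `sample_chain` stands in for one
    stochastic LLM decode returning a final answer."""
    answers = [sample_chain() for _ in range(n_paths)]
    best, count = Counter(answers).most_common(1)[0]
    return best, count / n_paths

# Toy sampler: right 3 times out of 5, deterministic for reproducibility.
outcomes = iter(["42", "41", "42", "42", "40"])
answer, agreement = self_consistency(lambda: next(outcomes))
print(answer, agreement)  # 42 0.6
```

The vote recovers the right answer even though a single sampled chain is wrong 40% of the time — the benefit lives in the parallel topology, not in any one chain.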

GoT as Knowledge Graph reasoning externalization (KGoT): The GoT topology finds a concrete implementation in KGoT, which externalizes LLM reasoning into a knowledge graph of interconnected triples. Each reasoning step adds, updates, or queries triples in a structured graph — providing two capabilities that standard GoT lacks: (1) an inspection mechanism where the graph's contents can be reviewed and validated at any point, enabling noise mitigation and error correction, and (2) graph updating where new evidence can modify existing triples, enabling iterative refinement. KGoT achieves 29% improvement on GAIA benchmarks using GPT-4o mini at a fraction of large model cost — demonstrating that GoT's aggregation topology becomes economically transformative when the graph is persistent and structured rather than ephemeral. See Can externalizing reasoning into knowledge graphs help smaller models compete?.
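The two capabilities KGoT adds — inspection and updating — correspond to query and replace operations on a triple store. A minimal sketch (the API names are illustrative assumptions, not KGoT's actual interface):

```python
class TripleStore:
    """Sketch of KGoT-style externalized reasoning state: each reasoning
    step adds, updates, or queries (subject, predicate, object) triples."""

    def __init__(self):
        self.triples = set()

    def add(self, s, p, o):
        self.triples.add((s, p, o))

    def update(self, s, p, new_o):
        # Graph updating: new evidence replaces the object of (s, p, ·).
        self.triples = {t for t in self.triples
                        if not (t[0] == s and t[1] == p)}
        self.triples.add((s, p, new_o))

    def query(self, s=None, p=None, o=None):
        # Inspection mechanism: any partial pattern can be reviewed.
        return [t for t in self.triples
                if (s is None or t[0] == s)
                and (p is None or t[1] == p)
                and (o is None or t[2] == o)]

kg = TripleStore()
kg.add("task", "requires", "web lookup")
kg.add("task", "status", "open")
kg.update("task", "status", "answered")   # iterative refinement
print(kg.query(s="task", p="status"))     # [('task', 'status', 'answered')]
```

Because the store persists across reasoning steps, any later step can validate or correct earlier ones — the property an ephemeral in-context GoT serialization lacks.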

Molecular bond taxonomy as complementary interaction-level classification: The Molecular Structure of Thought paper provides a third taxonomic dimension orthogonal to both external topology and internal hidden-state topology. It classifies interactions between reasoning segments by bond type: Deep-Reasoning (covalent bonds — dense local deduction clusters), Self-Reflection (hydrogen bonds — long-range corrective links), and Self-Exploration (van der Waals forces — weak bridges between distant clusters). Where the graph topology taxonomy classifies the shape of reasoning (chain, tree, graph), the molecular taxonomy classifies the forces between reasoning elements. The practical implication: "semantic isomers" — traces with the same semantic content but different bond distributions from different teachers — destabilize distillation even with matched token statistics. This means effective reasoning traces require not just the right topology (GoT > CoT) but the right distribution of interaction types within that topology. See Does long chain of thought reasoning follow molecular bond patterns?.
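The bond taxonomy can be read as a rule over interaction edges between reasoning segments, keyed on distance and interaction strength. The thresholds below are illustrative assumptions, not the paper's actual criteria:

```python
def classify_bond(src_idx, dst_idx, strength):
    """Toy rule assigning a bond type to an interaction edge between
    reasoning segments src_idx and dst_idx (strength in [0, 1])."""
    distance = abs(dst_idx - src_idx)
    if distance <= 2 and strength >= 0.7:
        return "covalent"        # Deep-Reasoning: dense local deduction
    if distance > 2 and strength >= 0.4:
        return "hydrogen"        # Self-Reflection: long-range correction
    return "van_der_waals"       # Self-Exploration: weak distant bridge

# Edges of a hypothetical trace: (source segment, target segment, strength).
trace_edges = [(0, 1, 0.9), (1, 2, 0.8), (2, 9, 0.5), (3, 12, 0.2)]
bonds = [classify_bond(*e) for e in trace_edges]
print(bonds)  # ['covalent', 'covalent', 'hydrogen', 'van_der_waals']
```

Two traces with the same segments but different `bonds` histograms would be "semantic isomers" in this picture — identical content, different interaction distributions.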

Internal hidden-state topology (Topology of Reasoning): This taxonomy covers external reasoning graph structure (how context is managed). A complementary internal dimension exists: reasoning graphs extracted from hidden-state representations at each step. Distilled reasoning models show ~5 cycles per sample (vs near-zero in base models), larger graph diameters, and ~6x higher small-world index. These internal topology properties correlate with accuracy. Cycles = "aha moments" (reconsideration events). Overthinking = redundant cycles. Underthinking = excessive diameter without sufficient backtracking cycles. This adds a mechanistic layer: external topology (CoT=chain, ToT=tree, GoT=graph) describes what reasoning structures are built; internal topology describes what dynamics occur in the model's representation space during reasoning. See Do reasoning cycles in hidden states reveal aha moments?.
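The two internal metrics mentioned — cycles and diameter — are directly computable once a step sequence and its graph are extracted. A toy sketch on a hand-built trace (the extraction from hidden states is the hard part and is assumed away here):

```python
from collections import deque

def count_cycle_edges(steps):
    """A revisit of an already-seen state reads as a cycle ('aha moment'):
    count transitions that return to a previously visited node."""
    seen, cycles = set(), 0
    for s in steps:
        if s in seen:
            cycles += 1
        seen.add(s)
    return cycles

def diameter(adj):
    """Longest shortest path over an undirected graph {node: set(nbrs)}."""
    def farthest(start):
        dist, q = {start: 0}, deque([start])
        while q:
            u = q.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    q.append(v)
        return max(dist.values())
    return max(farthest(n) for n in adj)

# A trace that advances, backtracks once to s1, then continues to s3.
trace = ["s0", "s1", "s2", "s1", "s3"]
adj = {"s0": {"s1"}, "s1": {"s0", "s2", "s3"}, "s2": {"s1"}, "s3": {"s1"}}
print(count_cycle_edges(trace), diameter(adj))  # 1 2
```

In these terms, overthinking inflates the cycle count without new nodes, while underthinking grows the diameter without any revisits.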

Recursive subtask trees with KV cache pruning (TIM, 2507.16784): The Thread Inference Model implements tree-of-thought reasoning at the infrastructure level. Reasoning trajectories are modeled as recursive trees of subtasks — higher-level nodes decompose into simpler subtasks until reaching leaf nodes that are completable in one step. The key hypothesis: processing an intermediate task does not need to attend to the completed subtasks of previous steps. A KV cache management system retains only relevant context tokens via subtask-pruning, enabling positional embedding reuse and GPU memory recycling. The system sustains high throughput while manipulating up to 90% of the KV cache. This is a concrete implementation of tree topology where branching is driven by task decomposition and pruning is driven by completion — adding an infrastructure dimension to the formal graph classification. See Can recursive subtask trees overcome context window limits?.
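The decompose-then-prune pattern can be sketched as a recursive tree walk where a parent attends only to child summaries, discarding each child's full working context on completion. Class and parameter names are illustrative; in TIM the pruned object is the KV cache, not a Python string:

```python
class Subtask:
    """Sketch of TIM-style recursive decomposition: non-leaf tasks split
    into children; a completed child's working context is pruned and only
    its summary is retained for the parent."""

    def __init__(self, goal, children=None):
        self.goal = goal
        self.children = children or []
        self.summary = None

    def solve(self, leaf_solver, pruned_log):
        if not self.children:                  # leaf: completable in one step
            self.summary = leaf_solver(self.goal)
        else:
            parts = [c.solve(leaf_solver, pruned_log) for c in self.children]
            # Parent attends only to child summaries, not their full traces.
            self.summary = f"{self.goal}: " + "; ".join(parts)
            for c in self.children:
                pruned_log.append(f"pruned context of '{c.goal}'")
        return self.summary

root = Subtask("plan trip", [Subtask("book flight"), Subtask("book hotel")])
pruned = []
result = root.solve(lambda goal: f"done({goal})", pruned)
print(len(pruned))  # 2 — both completed subtasks' contexts were released
```

The hypothesis in the paper maps onto the recursion boundary: nothing above a completed child ever reads its internal trace, which is what makes the aggressive cache reuse safe.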

Agent systems as optimizable graphs (from Arxiv/Agents): The Language Agents as Optimizable Graphs framework extends graph representation from individual reasoning traces to entire agent systems. Nodes implement functions (LLM inference, tool use, function calls), edges describe information flow, and composite graphs represent multi-agent swarms. This adds a crucial dimension: the graph topology is not merely descriptive but optimizable. Node optimization refines prompts based on task feedback; edge optimization changes connectivity, allowing suboptimal agent organization to be automatically recombined. CoT, ToT, Reflexion, and Self-Consistency are all representable within this framework. The Society of Mind (Minsky 1988) hierarchy — nodes → agent graphs → swarm composite graphs — means the same formalism covers both single-agent reasoning traces and multi-agent orchestration. See Can we automatically optimize both prompts and agent coordination?.
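The "optimizable" part is the key claim: both node behavior and edge connectivity are mutable state that feedback can rewrite. A minimal sketch under that reading (names and the string-piping execution model are assumptions; real nodes would be LLM or tool calls):

```python
class AgentGraph:
    """Sketch of an optimizable agent graph: nodes are callables, edges
    define information flow, and both can be modified after construction."""

    def __init__(self):
        self.nodes = {}          # name -> callable on a payload
        self.edges = []          # (src, dst) pairs, executed in order

    def add_node(self, name, fn):
        self.nodes[name] = fn

    def add_edge(self, src, dst):
        self.edges.append((src, dst))

    def rewire(self, old_edge, new_edge):
        # Edge optimization: swap a suboptimal connection for a better one.
        self.edges[self.edges.index(old_edge)] = new_edge

    def run(self, payload):
        # Execute nodes in edge order, piping each source's output forward.
        results = {}
        for src, dst in self.edges:
            results[src] = results.get(src, self.nodes[src](payload))
            results[dst] = self.nodes[dst](results[src])
        return results

g = AgentGraph()
g.add_node("draft", lambda x: x + " -> draft")
g.add_node("critic", lambda x: x + " -> critique")
g.add_node("tool", lambda x: x + " -> tool-check")
g.add_edge("draft", "critic")
g.rewire(("draft", "critic"), ("draft", "tool"))   # automatic recombination
print(g.run("task")["tool"])  # task -> draft -> tool-check
```

Node optimization would mutate the callables (in practice, their prompts) the same way `rewire` mutates connectivity — the graph is the search space, not just a description.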


Source: Reasoning Methods CoT ToT, Agents, Memory
