
When does sequential reasoning beat parallel voting?

Explores whether sequential chain-of-thought reasoning or parallel voting is more effective for different problem types. Understanding this trade-off helps predict which test-time compute strategy will work best.

Note · 2026-02-22 · sourced from Reasoning Methods CoT ToT
How should we allocate compute budget at inference time?

The prevailing empirical finding is that parallel sampling outperforms sequential extension under fixed token budgets (see Why does parallel reasoning outperform single chain thinking?). The "Let Me Think!" paper identifies a class of problems where this reverses — and the reversal is exponential, not marginal.

The setting: graph connectivity tasks, where the model must determine whether two vertices are connected by stepping through several edges. This is a proxy for structured multi-step reasoning: any problem where sub-results must be composed sequentially and the correct solution path has a specific depth. On these tasks, sequential chain-of-thought holds an exponential advantage over parallel voting.

The exponential gap arises because graph connectivity is inherently sequential: bounded-depth transformers struggle with it precisely because they cannot perform arbitrarily deep sequential computation in a single forward pass. By externalizing intermediate steps into the context window, CoT effectively increases the depth available.
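To make the depth argument concrete, here is a minimal sketch (not taken from the paper) of connectivity checking as frontier expansion. The function name `reachable_within` and the example path graph are illustrative assumptions. Each loop iteration stands in for one sequential step, so a computation capped at a fixed number of steps cannot confirm a connection that lies deeper than the cap.

```python
def reachable_within(adj, source, target, max_steps):
    """Return True if target is reachable from source in at most max_steps edge hops.

    Each loop iteration is one sequential step: it can only extend
    the frontier by a single edge beyond what earlier steps found.
    """
    seen = {source}
    frontier = {source}
    for _ in range(max_steps):
        if target in seen:
            return True
        # Expand the frontier by exactly one hop, skipping visited vertices.
        frontier = {n for v in frontier for n in adj.get(v, ()) if n not in seen}
        if not frontier:
            break
        seen |= frontier
    return target in seen


# Hypothetical example: a path graph 0 - 1 - 2 - ... - 8.
path = {i: [i - 1, i + 1] for i in range(1, 8)}
path[0] = [1]
path[8] = [7]

shallow = reachable_within(path, 0, 8, max_steps=3)  # step budget below the path depth
deep = reachable_within(path, 0, 8, max_steps=8)     # step budget matches the path depth
```

With a cap of 3 steps the check cannot reach vertex 8, while 8 steps suffice. In this analogy the cap plays the role of a fixed number of layers, and extra CoT tokens play the role of raising `max_steps`.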

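This is a fundamental qualification of the parallel-wins claim, not a contradiction of it. What reconciles the two is task structure: the sequential advantage applies to compositional problems like the one above, where each step depends on earlier sub-results.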

The practical heuristic: if solving a shorter version of the problem would not give useful information toward the longer version, parallel sampling is ineffective — each short chain is simply an incomplete attempt. Sequential extension is the only way forward.
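The heuristic can be illustrated with a deliberately crude toy model; it is an assumption made for illustration, not the paper's experimental setup. Assume a yes/no task that needs `depth` sequential steps, that a chain shorter than `depth` can only guess (50% accuracy), and that a chain at least that long is always correct. All function names below are hypothetical.

```python
from math import comb


def chain_accuracy(chain_len, depth):
    """Toy assumption: a chain succeeds only if it is long enough to cover the full depth."""
    return 1.0 if chain_len >= depth else 0.5  # 0.5 = coin-flip guess on a yes/no task


def majority_vote_accuracy(p, k):
    """Probability that a strict majority of k independent chains is correct (k odd)."""
    return sum(comb(k, i) * p**i * (1 - p) ** (k - i) for i in range(k // 2 + 1, k + 1))


def compare(budget, depth, k):
    """Split the same step budget k ways (parallel) or spend it on one chain (sequential)."""
    parallel = majority_vote_accuracy(chain_accuracy(budget // k, depth), k)
    sequential = chain_accuracy(budget, depth)
    return parallel, sequential


# Hypothetical numbers: 40 steps of budget, a problem of depth 30, 5 voters.
parallel_acc, sequential_acc = compare(budget=40, depth=30, k=5)
```

Under these assumptions, voting over five chains of 8 steps stays at chance level because no single chain covers the required depth, while the one 40-step chain succeeds. The model deliberately leaves out per-step noise, which is the regime where voting over complete attempts can help.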



Original note title

sequential cot offers exponential advantage over parallel voting on structured compositional problems