Do reasoning models actually beat standard models on optimization?

Explores whether extended chain-of-thought in reasoning models delivers performance gains on constraint-satisfaction problems like power-grid optimization. Matters because reasoning models are treated as automatic upgrades, but the evidence may not support that claim.

Note · 2026-05-18 · sourced from Reasoning Architectures

Reasoning models have been treated as a generalized capability upgrade: more thinking tokens at test time, broadly better performance. On constraint-bound numerical optimization the upgrade does not materialize. Reasoning variants do not systematically outperform their non-reasoning counterparts on power-grid, financial-operations, or cyber-security feasibility problems. A longer reasoning trace does not translate into more iterations of the underlying numerical procedure.

The reason this matters: extended chain-of-thought looks like it should help. The problem involves multi-step arithmetic, interacting constraints, and convergence-style reasoning — exactly the regime where "think more" is supposed to pay. The data say it does not. Whatever extended CoT is doing on these tasks, it is not running a Newton-Raphson iteration or a primal-dual update in latent space; it is producing more text without producing more computation.
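To make the contrast concrete, here is a minimal sketch of what a Newton-Raphson iteration does: every pass through the loop is a numeric update that shrinks the residual. The function name `newton_raphson` and the toy equation in the usage note are illustrative assumptions, not taken from the tasks discussed in this note.

```python
def newton_raphson(f, df, x0, tol=1e-10, max_iter=50):
    """Solve f(x) = 0 from the starting guess x0.

    Returns (root, iterations_used). Illustrative sketch only.
    """
    x = x0
    for iteration in range(max_iter):
        residual = f(x)
        if abs(residual) < tol:
            # Converged: the residual is small enough to accept x as a root.
            return x, iteration
        slope = df(x)
        if slope == 0:
            raise ZeroDivisionError("derivative is zero; Newton step undefined")
        # The actual computation: one Newton update per loop pass.
        x -= residual / slope
    raise RuntimeError("no convergence within max_iter iterations")
```

Called as `newton_raphson(lambda x: x * x - 2, lambda x: 2 * x, 1.0)`, the routine approaches the square root of 2 in a handful of updates. Progress comes from repeating the update step, not from describing the problem at greater length.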

This is consistent with a growing view that reasoning models excel where the bottleneck is exploration over reasoning paths (math contests, code, multi-hop QA) but stall where the bottleneck is numeric procedure. Constraint satisfaction over real physical systems is the latter. Adding chain length adds search over verbal restatements of the problem, not iterations of the algorithm that would solve it.

The product implication: choosing a reasoning model for an optimization-heavy workflow is not automatically the right call. The relevant question is whether the bottleneck is verbal reasoning or numeric computation. If it is numeric, the cost-effective path is to hand the problem off to a solver, not to spend more thinking tokens.
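One way to picture the hand-off: the model's job ends at formulating the problem, and a deterministic routine runs the iterations. The sketch below assumes a toy economic-dispatch problem (quadratic generator costs, one demand-balance constraint, output limits) solved by bisection on the marginal price. The generator data and the name `dispatch` are hypothetical, and a production workflow would more likely call an established optimization library than hand-written code like this.

```python
def dispatch(generators, demand, tol=1e-9, max_iter=200):
    """Least-cost dispatch for generators given as (a, b, p_min, p_max),
    where the cost of producing p is a*p**2 + b*p with a > 0.

    Returns (outputs, price). Illustrative sketch only.
    """
    if not sum(g[2] for g in generators) <= demand <= sum(g[3] for g in generators):
        raise ValueError("demand is infeasible for the given generator limits")

    def outputs_at(price):
        # Each unit produces where marginal cost 2*a*p + b equals the price,
        # clipped to its output limits.
        return [min(max((price - b) / (2 * a), p_min), p_max)
                for a, b, p_min, p_max in generators]

    # Bracket the price between the lowest and highest marginal costs.
    lo = min(2 * a * p_min + b for a, b, p_min, _ in generators)
    hi = max(2 * a * p_max + b for a, b, _, p_max in generators)
    price = lo
    for _ in range(max_iter):
        price = (lo + hi) / 2
        total = sum(outputs_at(price))
        if abs(total - demand) < tol:
            break
        if total < demand:
            lo = price  # too little generation: raise the price
        else:
            hi = price  # too much generation: lower the price
    return outputs_at(price), price
```

For two hypothetical units `[(0.01, 2.0, 0, 100), (0.02, 1.0, 0, 100)]` and a demand of 150, the routine balances supply and demand in a few dozen bisection steps, each of which is cheap arithmetic and none of which requires generating text.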

Original note title

reasoning models do not systematically outperform non-reasoning models on real numerical optimization — extended chain-of-thought is not a substitute for iterative computation