Do large language models actually perform iterative optimization?

Explores whether LLMs execute genuine numerical procedures like Newton-Raphson or instead pattern-match to memorized solution templates when solving constrained optimization problems.

Note · 2026-05-18 · sourced from Reasoning Architectures

The constraint-optimization study identifies the mechanism behind the 55-60% plateau directly. LLMs cannot perform Newton-Raphson iterations in their latent space, nor primal-dual updates, nor any other iterative numerical procedure that genuine optimization requires. When asked to do so, they fall back to what the paper calls "result guessing": recognizing the problem as similar to a standard power grid (or financial dataset, or security scenario) and emitting values that pattern-match what a valid solution should look like.
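For contrast, here is roughly what one of the named procedures looks like when it is actually executed. This is a minimal sketch of scalar Newton-Raphson. The toy equation, starting point, and tolerance are illustrative assumptions and do not come from the study. The point is only that each iterate is computed from the previous one, so the answer cannot be produced without running the loop.

```python
def newton_raphson(f, df, x0, tol=1e-12, max_iter=50):
    """Find a root of f by Newton-Raphson, returning (root, iterates)."""
    x = x0
    iterates = [x]
    for _ in range(max_iter):
        step = f(x) / df(x)   # Newton step uses the current iterate only
        x = x - step          # next iterate depends on the previous one
        iterates.append(x)
        if abs(step) < tol:   # stop once the update is negligible
            break
    return x, iterates


# Hypothetical toy equation x^3 - 2x - 5 = 0 (not taken from the study).
def f(x):
    return x**3 - 2 * x - 5


def df(x):
    return 3 * x**2 - 2


root, iterates = newton_raphson(f, df, x0=2.0)
print(f"root = {root:.10f} after {len(iterates) - 1} updates")
```

Starting from 2.0, the first update moves to 2.1 and later updates shrink rapidly toward the root near 2.0945515. A guessed value can land near that number, but only the loop guarantees the residual f(root) is driven to zero.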

The fallback is silent. The output is fluent, well-formatted, often plausible. It can pass surface-level inspection because the model has seen many examples of what answers in this domain look like. What it has not done is solve the problem. The constraint values are wrong in ways that physical or financial systems would actually reject.
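One way to make the silent fallback visible is to check a returned answer against the constraints instead of reading it. The sketch below assumes a hypothetical two-generator dispatch problem. The names `g1` and `g2`, the limits, and the demand figure are invented for illustration and are not data from the study.

```python
def constraint_violations(dispatch, demand, limits, tol=1e-6):
    """Return a list of human-readable constraint violations (empty if feasible)."""
    violations = []
    # Equality constraint: total generation must match demand.
    imbalance = sum(dispatch.values()) - demand
    if abs(imbalance) > tol:
        violations.append(f"power balance off by {imbalance:+.3f}")
    # Inequality constraints: each unit must stay within its limits.
    for name, output in dispatch.items():
        low, high = limits[name]
        if output < low - tol or output > high + tol:
            violations.append(f"{name} output {output} outside [{low}, {high}]")
    return violations


# Hypothetical problem data (assumptions, not from the study).
limits = {"g1": (0.0, 60.0), "g2": (0.0, 50.0)}
demand = 100.0

guessed = {"g1": 62.0, "g2": 45.0}  # plausible-looking, well-formatted values
solved = {"g1": 60.0, "g2": 40.0}   # a feasible point for comparison

print(constraint_violations(guessed, demand, limits))
print(constraint_violations(solved, demand, limits))
```

The guessed dispatch reads like a reasonable answer, yet it breaks both the balance equation and a capacity limit. Feasibility alone does not establish optimality, but it is the cheapest check that separates a derived answer from a template match.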

This explains why scale, architecture, and training regime do not move the plateau. They improve the template but not the procedure. A larger model has seen more example solutions and can produce more convincing guesses. Reinforcement learning on outcome rewards reinforces the template-matching pattern. None of this installs the iterative-computation capability the problem requires.

The mechanism, pattern-matching against memorized solution shapes when genuine computation is required, generalizes beyond optimization. It is plausibly the same mechanism behind a class of mathematical-reasoning failures in which models produce confidently wrong numerical answers that have the shape of a correct one. The category is "looks like a solution; is not derived from one."


Original note title

LLMs cannot execute iterative numerical methods in latent space and fall back to result guessing against memorized templates
