Reinforcement Learning for LLMs · LLM Reasoning and Architecture

Does revising your own reasoning actually help or hurt?

Self-revision in reasoning models often degrades accuracy, while external critique improves it. Understanding what makes revision helpful or harmful could reshape how we design systems that need to correct themselves.

Note · 2026-02-21 · sourced from Test Time Compute

The self-revision literature contains an apparent contradiction. Critique-in-the-loop approaches (AutoMathCritique, Agent-R, Meta-Reasoner) show that revision guided by step-level feedback improves actor model performance. The reasoning-model evidence shows that self-revision degrades accuracy — more revision tokens correlate with wrong answers, and smaller models primarily switch correct answers to incorrect during revision, not the reverse. Revision helps in one literature and harms in the other.

The resolution: revision source is the determining variable, not the revision act itself.

Externally guided revision: A separate model — potentially better calibrated, trained on critique quality, operating with fresh context — evaluates the current response and provides correction signals. The actor revises against these signals. The quality of the revision is bounded by the quality of the external critic, which can be better than the actor's self-evaluation capacity.
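The loop described above can be sketched as code. This is a minimal illustration, not any specific system's API: `generate`, `external_critique`, the stopping rule, and the toy stand-ins are all hypothetical names introduced here.

```python
# Sketch of an externally guided revision loop. The function names
# (generate, external_critique) and the stopping criterion are illustrative
# stand-ins, not any library's actual interface.

def revise_with_external_critic(problem, generate, external_critique, max_rounds=3):
    """Actor revises against feedback from a separate critic model."""
    response = generate(problem, feedback=None)
    for _ in range(max_rounds):
        critique = external_critique(problem, response)  # fresh-context evaluation
        if critique is None:                             # critic finds no flaw: stop
            break
        response = generate(problem, feedback=critique)  # revise against the signal
    return response

# Toy stand-ins to exercise the loop: the "critic" flags drafts below a target.
def toy_generate(problem, feedback=None):
    return problem + 1 if feedback is None else problem + 2

def toy_critique(problem, response):
    return "too low" if response < problem + 2 else None

print(revise_with_external_critic(5, toy_generate, toy_critique))  # -> 7
```

The key structural point: the correction signal comes from a component other than the one that produced the draft, so the loop's ceiling is set by the critic, not by the actor's self-evaluation.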

Internally driven revision: The same model second-guesses its own output. The self-evaluation is bounded by the same uncertain capabilities that produced the uncertain output in the first place. A model that got an answer wrong does not have a reliable mechanism for knowing it got it wrong — if it did, it would not have produced the wrong answer. Internal revision therefore adds noise without a reliable correction signal.
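A toy probability model (my own illustration, not from the source) makes the asymmetry concrete. Assume the draft is correct with probability p, the critic labels it correctly with probability q, and, as a simplifying assumption, a retried answer is again correct with probability p. Then a chance-level critic (q = 0.5) leaves expected accuracy unchanged, an informative external critic (q > 0.5) improves it, and a worse-than-chance self-evaluation (q < 0.5, e.g. flagging correct answers while keeping confidently wrong ones) degrades it.

```python
def accuracy_after_revision(p, q):
    """Expected accuracy after one critique-gated revision round.

    p: probability the actor's draft is correct
    q: probability the critic labels the draft correctly
    Simplifying assumption (mine, not the source's): a retried
    answer is correct with the same base probability p.
    """
    keep_correct = p * q                  # correct draft, critic approves
    retry_from_correct = p * (1 - q) * p  # correct draft wrongly flagged, retried
    retry_from_wrong = (1 - p) * q * p    # wrong draft correctly flagged, retried
    return keep_correct + retry_from_correct + retry_from_wrong

p = 0.6
print(accuracy_after_revision(p, 0.9))  # strong external critic: 0.792
print(accuracy_after_revision(p, 0.5))  # self-eval at chance: stays at 0.6
print(accuracy_after_revision(p, 0.4))  # self-eval below chance: drops below 0.6
```

This matches the argument in the paragraph above: revision only pays off when the evaluation signal is more reliable than chance, which a self-critic bounded by the actor's own capabilities often is not.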

As argued in "Does a model improve by arguing with itself?", the mechanism of internal harm is confidence amplification rather than correction: the model does not revise toward correct answers, it revises toward more confidently stated incorrect ones. External debate prevents this by providing genuine challenge.

The practical implication for reasoning system design: do not rely on internal revision loops. If revision is needed, provide an external critic. "Do critique models improve diversity during training itself?" is the training-time version of the same principle — external critique is more valuable than self-critique at both training and inference time.

