LLM Reasoning and Architecture · Reinforcement Learning for LLMs

What critical thinking skills do reasoning models actually lose?

Step-by-step reasoning training optimizes narrow deductive thinking while degrading meta-cognitive abilities like recognizing futile thinking and maintaining tentative reasoning. Understanding this tradeoff matters for deploying reasoning models reliably.

Note · 2026-02-22 · sourced from Reasoning Critiques
How should we allocate compute budget at inference time? What kind of thing is an LLM really?

Post angle: Medium

We trained AI to think. In doing so, we trained it not to think in two specific and important ways.

Failure mode 1: It can't recognize when thinking is futile

Give a reasoning model a question with a missing premise — a question that cannot be answered because essential information is absent. A non-reasoning model quickly produces a short response acknowledging the problem. A reasoning model produces a response five times longer, cycling through "alternatively," "wait," "but..." — generating elaborate chains that never converge because there's nothing to converge on.

Non-reasoning models show better critical thinking about when to think at all. Reasoning-specific training optimizes for applying elaborate thinking patterns; it does not develop the meta-capability to disengage when engagement is inappropriate.
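That cycling pattern suggests a crude external guardrail: flag a chain of thought as likely non-convergent when backtracking markers ("alternatively", "wait", "but") dominate the text. A minimal sketch; the marker list and threshold are illustrative assumptions, not measurements from the source:

```python
import re

# Hedge/backtrack markers that, per the failure mode above, pile up
# when a model cycles without converging. List is illustrative.
HEDGE_MARKERS = ("alternatively", "wait", "but")

def hedge_density(chain_of_thought: str) -> float:
    """Fraction of words in the chain that are hedge/backtrack markers."""
    words = re.findall(r"[a-z']+", chain_of_thought.lower())
    if not words:
        return 0.0
    return sum(w in HEDGE_MARKERS for w in words) / len(words)

def looks_futile(chain_of_thought: str, threshold: float = 0.08) -> bool:
    """Heuristic: a high hedge density suggests the chain is cycling,
    e.g. on a question with a missing premise. Threshold is a guess."""
    return hedge_density(chain_of_thought) > threshold
```

A converging derivation scores near zero; the "alternatively / wait / but" loop described above scores far higher. This catches only the surface signature of futile thinking, not its cause, but surface signatures are what an external monitor has to work with.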

Failure mode 2: It reasons its way to the wrong rule

Give a reasoning model four games with hidden special rules. Non-reasoning models score 55–65% on those exception-based rules; reasoning models score below 25%. The detailed thinking chains actively make things worse — models apply arithmetic to symbols, overgeneralize from two examples, or invent rules that weren't in the data.

Inductive reasoning from sparse, exception-containing observations requires a different kind of thinking: tentative, minimal, defeasible. The CoT pattern forces positive, elaborating chains that work against the task.
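What "tentative, minimal, defeasible" reasoning looks like mechanically: a more specific rule (an exception) overrides a general one instead of being rationalized around. A toy sketch — the rules and the specificity measure are invented for illustration, not taken from the source games:

```python
# Defeasible rule application: conclusions are held tentatively and
# yield to more specific exceptions. Rules here are toy examples.
Rule = tuple[dict, str]  # (condition attributes, conclusion)

RULES: list[Rule] = [
    ({"shape": "circle"}, "score +1"),                  # general rule
    ({"shape": "circle", "color": "red"}, "score -3"),  # exception
]

def apply_rules(observation: dict) -> str:
    """Return the conclusion of the MOST SPECIFIC matching rule."""
    matches = [
        (len(cond), conclusion)
        for cond, conclusion in RULES
        if all(observation.get(k) == v for k, v in cond.items())
    ]
    if not matches:
        return "no rule applies"
    # Specificity = number of matched attributes; exceptions win.
    return max(matches)[1]
```

The failure mode described above is the opposite behavior: an elaborating chain notices the red circle, then argues its way back to the general rule because the exception "doesn't make sense."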

The pattern: Training for deductive, step-by-step reasoning improves that specific skill while degrading adjacent cognitive capabilities — the ability to disengage, the ability to remain tentative, the ability to recognize an exception rather than rationalize around it.

The implication: Reasoning models have a narrower cognitive profile than their benchmark performance suggests. The benchmarks are in-distribution, CoT-suited tasks. The real-world distribution also contains ill-posed questions, hidden rules, and problems where the correct response is to stop thinking.
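One way to act on that, connecting back to the compute-budget question above: cap extended thinking and stop early when the chain starts repeating itself. A sketch under assumed thresholds — real deployments would need to tune these against false stops:

```python
from collections import Counter

def should_stop(chain: str, max_tokens: int = 2048,
                repeat_threshold: int = 3) -> bool:
    """Early-stop policy sketch: halt extended thinking when the budget
    is spent OR the chain keeps repeating the same trigram — a sign,
    per the note, that there is nothing to converge on.
    Token counting via split() and both thresholds are illustrative."""
    tokens = chain.split()
    if len(tokens) >= max_tokens:
        return True
    trigrams = Counter(zip(tokens, tokens[1:], tokens[2:]))
    most_common = trigrams.most_common(1)
    return bool(most_common) and most_common[0][1] >= repeat_threshold
```

A controller calling this between decoding chunks spends the budget on chains that are still producing new content and cuts off the ones that are cycling — exactly the ill-posed cases where the correct response is to stop thinking.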


Source: Reasoning Critiques
