Why do some questions perform better without step-by-step reasoning?
Explores whether chain-of-thought prompting universally improves reasoning or whether simpler prompts work better for certain questions. This matters because it challenges assumptions about how LLMs should be prompted to solve problems.
"Instance-adaptive Zero-shot Chain-of-Thought Prompting" (2024) uses neuron saliency score analysis to detect the mechanism underlying zero-shot CoT — why some prompts work for some instances and fail for others.
The finding: successful reasoning requires a specific information flow pattern across three components (question q, prompt p, rationale r). First, semantic information from the question must aggregate to the prompt. Then, reasoning steps must gather information from both the original question directly AND the synthesized question-prompt semantic information. When this flow is disrupted — when the prompt does not absorb question semantics, or when the rationale ignores the question — reasoning fails.
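The three-way flow pattern can be made concrete. Below is a minimal sketch (assumed shapes and names, not the paper's actual code): given a token-level saliency matrix, where entry (i, j) measures how much output token i draws on input token j, aggregate over the question, prompt, and rationale spans to get the two flows the finding describes.

```python
import numpy as np

def flow_scores(saliency, q_idx, p_idx, r_idx):
    """Aggregate token-level saliency into the flows associated with
    successful zero-shot CoT:
      - question -> prompt: the prompt absorbs question semantics
      - question -> rationale and prompt -> rationale: reasoning steps
        draw on both the question and the question-prompt synthesis.
    `saliency` is a (target, source) matrix; the index lists mark the
    token positions of each span (illustrative layout, not the paper's).
    """
    s = np.asarray(saliency)
    return {
        "q_to_p": s[np.ix_(p_idx, q_idx)].mean(),  # question semantics reaching the prompt
        "q_to_r": s[np.ix_(r_idx, q_idx)].mean(),  # rationale attending to the question
        "p_to_r": s[np.ix_(r_idx, p_idx)].mean(),  # rationale attending to the prompt
    }

# Toy example: 6 tokens, question at [0, 1], prompt at [2, 3], rationale at [4, 5]
rng = np.random.default_rng(0)
scores = flow_scores(rng.random((6, 6)), [0, 1], [2, 3], [4, 5])
```

In this framing, a disrupted flow shows up as a low `q_to_p` or `q_to_r` score: the prompt never absorbed the question, or the rationale ignored it.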
The practical consequence is striking: "Don't think. Just feel." — generally regarded as a less favorable prompt — outperforms "Let's think step by step" on some simple questions. The step-by-step prompt can guide the LLM into bad reasoning on questions that could be straightforwardly answered. This is not random noise; the saliency analysis shows WHY: for simple questions, the step-by-step prompt introduces unnecessary intermediate structure that disrupts the direct question-to-answer information flow.
This extends Why do chain-of-thought examples fail across different conditions? from exemplar-level brittleness to instance-level brittleness. The problem is not just that different exemplars produce different results; the same prompt is fundamentally inappropriate for a subset of instances. Building on When does explicit reasoning actually help model performance?, the instance-adaptive finding supplies the information-flow mechanism: logical derivation tasks route well through the prompt-mediated pathway, while simpler or judgment-based tasks are disrupted by it.
The implication for reasoning model design: a single universal reasoning prompt is a design error. The optimal prompt depends on the specific question-prompt interaction, not on the task category. Building on When should an agent actually stop and deliberate?, the instance-adaptive finding extends the principle from "when to deliberate" to "how to deliberate": the form of reasoning must adapt to the question, not just the decision of whether to reason.
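One way this implication could be operationalized is per-instance prompt selection. The sketch below is hypothetical: `flow_strength` (a scorer for question-prompt information flow, e.g. derived from saliency analysis of a forward pass) and the threshold are illustrative assumptions, not the paper's algorithm.

```python
def select_prompt(question, candidates, flow_strength, threshold=0.5):
    """Pick the candidate reasoning prompt with the strongest
    question-to-prompt information flow for this specific question.
    If no candidate clears the threshold, return None: the instance is
    treated as simple enough to answer directly, without step-by-step
    scaffolding that would disrupt the question-to-answer flow.

    flow_strength(question, prompt) -> float is an assumed scorer.
    """
    best_score, best_prompt = max(
        (flow_strength(question, p), p) for p in candidates
    )
    return best_prompt if best_score >= threshold else None

# Toy scorer standing in for real saliency analysis: pretend short
# questions are "simple" and route poorly through any reasoning prompt.
toy = lambda q, p: 0.2 if len(q.split()) < 6 else 0.9

choice = select_prompt(
    "What is 2+2?",
    ["Let's think step by step", "Don't think. Just feel."],
    toy,
)
```

With the toy scorer, the short question falls below the threshold and `choice` is `None`: the selector skips the reasoning prompt for an instance that is better answered directly.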
Source: Prompts Prompting
Related concepts in this collection
- Why do chain-of-thought examples fail across different conditions? Chain-of-thought exemplars show surprising sensitivity to order, complexity level, diversity, and annotator style; understanding these brittleness dimensions could reveal what makes reasoning prompts robust or fragile. Relation: extends brittleness from exemplar-level to instance-level; the same prompt fails on different instances.
- When does explicit reasoning actually help model performance? Explicit reasoning improves some tasks but hurts others; what determines whether step-by-step reasoning chains are beneficial or harmful for a given problem? Relation: the information-flow mechanism explains why; logical tasks route through prompt mediation while judgment tasks are disrupted by it.
- When should an agent actually stop and deliberate? How can models detect when deliberation over action choices is genuinely needed versus wasteful? This matters because unbounded action spaces make universal deliberation intractable, yet skipping it entirely risks missing critical errors. Relation: extends "when to think" to "how to think" for each instance.
- How much does demo position alone affect in-context learning accuracy? Moving demonstrations from prompt start to end without changing their content produces surprisingly large accuracy swings; does spatial position in the prompt matter more than what demonstrations actually contain? Relation: another instance of prompt structure mattering more than content.
Original note title: instance-adaptive prompting reveals that successful zero-shot CoT requires question-to-prompt information flow — some instances perform better without step-by-step reasoning