Fast, Slow, and Tool-augmented Thinking for LLMs: A Review

Paper · arXiv 2508.12265 · Published August 17, 2025
Routers · Novel Architectures

Large Language Models (LLMs) have demonstrated remarkable progress in reasoning across diverse domains. However, effective reasoning in real-world tasks requires adapting the reasoning strategy to the demands of the problem, ranging from fast, intuitive responses to deliberate, step-by-step reasoning and tool-augmented thinking. Drawing inspiration from cognitive psychology, we propose a novel taxonomy of LLM reasoning strategies along two knowledge boundaries: (1) a fast/slow boundary separating intuitive from deliberative processes, and (2) an internal/external boundary distinguishing reasoning grounded in the model’s parameters from reasoning augmented by external tools. We systematically survey recent work on adaptive reasoning in LLMs and categorize methods based on key decision factors. We conclude by highlighting open challenges and future directions toward more adaptive, efficient, and reliable LLMs.

Fast thinking: Analogous to human System 1 cognition, fast thinking involves intuitive responses generated directly from the model without explicit intermediate steps [32]. Fast thinking offers low latency and high throughput, but may be prone to errors on unfamiliar or complex problems.

Slow thinking: Similar to human System 2 cognition, slow thinking engages in step-by-step, deliberative reasoning. Techniques such as chain-of-thought prompting [46], self-reflection [10], and intermediate verification [22] enable the model to decompose problems and validate intermediate results, thereby improving performance on more challenging tasks.

Tool-augmented thinking: Tool-augmented thinking extends beyond the model’s innate capabilities by incorporating external tools (e.g., calculators, code interpreters, search engines, or knowledge bases). It parallels humans’ use of external aids to fill knowledge gaps, perform precise computation, or access up-to-date information.
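
To make the distinction concrete, the sketch below casts the three strategies as prompting patterns around a generic chat-completion call. This is a minimal illustration, not an interface from the paper: the `call_llm` stub, the prompt wordings, and the `eval`-based calculator are all hypothetical placeholders.

```python
def call_llm(prompt: str) -> str:
    """Stand-in for any chat-completion endpoint."""
    raise NotImplementedError

def fast_thinking(question: str) -> str:
    # System-1 style: request the answer directly, with no intermediate steps.
    return call_llm(f"Answer directly and concisely: {question}")

def slow_thinking(question: str) -> str:
    # System-2 style: elicit explicit step-by-step (chain-of-thought) reasoning.
    return call_llm(f"Think step by step, then state the final answer.\n{question}")

def tool_augmented_thinking(question: str) -> str:
    # Delegate precise computation to an external tool rather than the model's
    # parameters. Here the "tool" is Python's eval acting as a toy calculator.
    expr = call_llm(f"Extract the arithmetic expression to evaluate: {question}")
    result = eval(expr)  # illustration only; a real system would use a sandboxed tool
    return call_llm(f"{question}\nA tool computed {expr} = {result}. Final answer?")
```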

However, the complexity and diversity of real-world applications demand that LLMs flexibly adapt their reasoning strategy to each task’s accuracy and latency constraints. Hence, selecting among and switching between appropriate reasoning strategies has become an active and growing area of research [61, 14].

We group existing approaches into two categories: implicit and explicit selection. Implicit selection refers to models that determine the reasoning strategy internally, a behavior typically learned end-to-end during post-training. At inference time, the model selects a reasoning strategy without explicit control signals (i.e., the control signal c in Eq. (1) is omitted). In contrast, explicit selection relies on external routing mechanisms, such as predefined rules, prompts, or router networks.
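
Continuing the hypothetical sketch above, the following contrasts the two selection modes: explicit selection receives the control signal c from outside the model, while implicit selection leaves the choice to the model itself.

```python
from typing import Callable, Dict, Literal

Strategy = Literal["fast", "slow", "tool"]

# Reusing the hypothetical strategy functions from the previous sketch.
STRATEGIES: Dict[Strategy, Callable[[str], str]] = {
    "fast": fast_thinking,
    "slow": slow_thinking,
    "tool": tool_augmented_thinking,
}

def explicit_selection(question: str, control: Strategy) -> str:
    # The control signal c comes from outside the model: a predefined rule,
    # a prompt, or a trained router network.
    return STRATEGIES[control](question)

def implicit_selection(question: str) -> str:
    # No external control signal: the model itself decides how deliberately
    # to respond, a behavior typically instilled during post-training.
    return call_llm(question)
```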

4.2 Explicit Selection

Explicit selection introduces external mechanisms that explicitly guide the model in choosing an appropriate reasoning strategy. We further categorize existing methods into rule-based and model-based selection approaches.
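
As a rough illustration of the two families, again continuing the sketch above: a rule-based router maps surface features of the query to a strategy via hand-written heuristics, while a model-based router delegates the choice to a learned classifier. Both routers and the `classifier.predict` interface are assumptions made for illustration, not methods surveyed in the paper.

```python
def rule_based_router(question: str) -> Strategy:
    # Hand-written heuristics map surface features of the query to a strategy.
    if any(ch.isdigit() for ch in question) and any(op in question for op in "+-*/"):
        return "tool"   # arithmetic content: delegate to a calculator tool
    if len(question.split()) > 30 or "prove" in question.lower():
        return "slow"   # long or proof-style queries: deliberate reasoning
    return "fast"       # otherwise: answer directly

def model_based_router(question: str, classifier) -> Strategy:
    # A learned router: `classifier` stands for any trained model (e.g., a
    # fine-tuned text encoder) that predicts which strategy is likely to
    # succeed at acceptable cost for this input.
    return classifier.predict(question)
```

Either router’s output can then be passed as the control signal to `explicit_selection` from the previous sketch.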