Towards Reasoning Era: A Survey of Long Chain-of-Thought for Reasoning Large Language Models

Paper · arXiv 2503.09567 · Published March 12, 2025

(1) We first distinguish Long CoT from Short CoT and introduce a novel taxonomy to categorize current reasoning paradigms. (2) Next, we explore the key characteristics of Long CoT: deep reasoning, extensive exploration, and feasible reflection, which enable models to handle more complex tasks and produce more efficient, coherent outcomes than the shallower Short CoT. (3) We then investigate key phenomena such as the emergence of Long CoT with these characteristics, overthinking, and test-time scaling, offering insights into how these processes manifest in practice. (4) Finally, we identify significant research gaps and highlight promising future directions, including the integration of multi-modal reasoning, efficiency improvements, and enhanced knowledge frameworks.

We first define and examine the distinctions between Long CoT and traditional Short CoT, focusing on the following key aspects: (1) Deep Reasoning, which requires a sufficient depth of logical processing to manage an extensive set of reasoning nodes; (2) Extensive Exploration, which involves generating parallel uncertain nodes and transitioning from known to unknown logic; and (3) Feasible Reflection, which involves feedback and refinement of logical connections. These characteristics enable Long CoT paradigms to integrate more intricate reasoning and accommodate a broader range of logical structures, ultimately leading to more efficient and coherent outcomes. Subsequently, we systematically explore the underlying explanations for key phenomena associated with Long CoT, such as its emergence, the overthinking phenomenon, inference-time scaling, and the "Aha Moment," among others. To our knowledge, this is the first comprehensive survey dedicated to these specific topics.
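
The three aspects above can be made concrete with a toy search problem. The sketch below is an illustration only, not the survey's formalism: the arithmetic task, the `Budget` fields, and the `solve` function are all hypothetical names introduced here. It maps depth of reasoning to a step limit, exploration to the number of alternatives considered per node, and reflection to the ability to revise a failed branch.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

# Hypothetical toy task: reach `target` from a start value using named steps.
Step = Tuple[str, Callable[[int], int]]
STEPS: List[Step] = [("+3", lambda x: x + 3), ("*2", lambda x: x * 2)]


@dataclass
class Budget:
    max_depth: int     # deep reasoning: how many chained steps are allowed
    branch_width: int  # extensive exploration: alternatives considered per node
    reflect: bool      # feasible reflection: may a failed branch be revised?


def solve(value: int, target: int, budget: Budget, depth: int = 0) -> Optional[List[str]]:
    """Return step names that turn `value` into `target`, or None on failure."""
    if value == target:
        return []
    # Both steps only increase positive values, so overshooting is a dead end.
    if depth >= budget.max_depth or value > target:
        return None
    for name, fn in STEPS[: budget.branch_width]:
        rest = solve(fn(value), target, budget, depth + 1)
        if rest is not None:
            return [name] + rest
        if not budget.reflect:
            # Without reflection, the first failed branch is final.
            return None
    return None


short_cot = Budget(max_depth=2, branch_width=1, reflect=False)
long_cot = Budget(max_depth=5, branch_width=2, reflect=True)

print(solve(1, 14, short_cot))
print(solve(1, 14, long_cot))
```

Under these assumptions the shallow, single-path budget fails to reach 14 from 1 and returns `None`, while the deeper budget with branching and backtracking finds the path `+3`, `+3`, `*2`. Keeping the same depth and width but setting `reflect=False` also fails, which loosely mirrors why the three aspects are listed separately.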