Reasoning Models Can Be Effective Without Thinking

Paper · arXiv 2504.09858 · Published April 14, 2025

Recent LLMs have significantly improved reasoning capabilities, primarily by including an explicit, lengthy Thinking process as part of generation. In this paper, we question whether this explicit thinking is necessary. Using the state-of-the-art DeepSeek-R1-Distill-Qwen, we find that bypassing the thinking process via simple prompting, denoted as NoThinking, can be surprisingly effective. When controlling for the number of tokens, NoThinking outperforms Thinking across a diverse set of seven challenging reasoning datasets—

Most modern reasoning models, such as R1 and R1-Distill-Qwen, follow a similar generation structure: a reasoning process inside a thinking box, delimited by <|beginning of thinking|> and <|end of thinking|>, followed by the final answer. Based on this structure, we define the two methods, Thinking and NoThinking, as follows. Thinking is the default way of querying a reasoning model, producing the reasoning process within the thinking box, then the final solution, and then the final answer (Figure 1, blue). NoThinking bypasses the explicit reasoning process through prompting and generates the final solution and answer directly. This is achieved by forcing the thinking box to be effectively empty during decoding, prefilling it with: <|beginning of thinking|> Okay, I think I have finished thinking. <|end of thinking|> (Figure 1, orange). The exact prompts we use can be found in Appendix C.
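The two querying modes above can be sketched as prompt construction. This is a minimal illustration, not the paper's actual implementation: the chat-template layout (the `User:`/`Assistant:` framing and the `build_prompt` helper) is a simplified stand-in for the model's real template, while the thinking-box delimiters and the prefill sentence are taken from the text.

```python
# Delimiters of the thinking box, as quoted in the text.
THINK_OPEN = "<|beginning of thinking|>"
THINK_CLOSE = "<|end of thinking|>"

# The fixed prefill that closes the thinking box immediately,
# so the model skips straight to the final solution (NoThinking).
NOTHINKING_PREFILL = (
    f"{THINK_OPEN} Okay, I think I have finished thinking. {THINK_CLOSE}"
)


def build_prompt(question: str, nothinking: bool = False) -> str:
    """Return the text the model is asked to continue.

    Thinking (default): decoding starts inside the thinking box, so the
    model emits its reasoning, then the solution and answer.
    NoThinking: the thinking box is pre-filled as (effectively) empty,
    so decoding starts directly at the final solution.
    """
    prompt = f"User: {question}\nAssistant: "
    if nothinking:
        prompt += NOTHINKING_PREFILL + "\n"
    return prompt


# Thinking: the model opens the thinking box itself and reasons at length.
thinking_prompt = build_prompt("What is 2 + 2?")

# NoThinking: the already-closed thinking box forces a direct answer.
nothinking_prompt = build_prompt("What is 2 + 2?", nothinking=True)
```

In practice this prefill would be passed as the start of the assistant turn (e.g., as forced decoder tokens), so the model's first free token falls after <|end of thinking|>.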