From Explicit CoT to Implicit CoT: Learning to Internalize CoT Step by Step
When leveraging language models for reasoning tasks, generating explicit chain-of-thought (CoT) steps often proves essential for achieving high accuracy in final outputs. In this paper, we investigate whether models can be taught to internalize these CoT steps. To this end, we propose a simple yet effective method for internalizing CoT steps: starting with a model trained for explicit CoT reasoning, we gradually remove the intermediate steps and finetune the model. This process allows the model to internalize the intermediate reasoning steps, simplifying inference while maintaining high performance. Our approach enables a GPT-2 Small model to solve 9-by-9 multiplication with up to 99% accuracy, whereas standard training cannot solve beyond 4-by-4 multiplication.
In this work, we examine the possibility of internalizing the reasoning process in the model’s hidden states. We propose an approach, Stepwise Internalization, which begins with a model trained for explicit CoT reasoning. We then gradually remove the intermediate steps and finetune the model, forcing it to internalize the reasoning process. Once all intermediate steps are internalized, we achieve a model capable of full implicit CoT reasoning. Moreover, even in cases where the model does not have the capacity for full implicit CoT reasoning, this method still allows for shortening the reasoning chain while maintaining accuracy.
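The procedure above can be sketched as a simple training loop. The following is a minimal illustration, not the paper's actual implementation: the function names, the fixed number of tokens removed per stage, and the `finetune` callback are all assumptions made for exposition.

```python
# Illustrative sketch of Stepwise Internalization (hypothetical code;
# stage schedule and function signatures are assumptions, not from the paper).

def remove_cot_tokens(example, n_removed):
    """Drop the first n_removed CoT tokens, keeping question and answer."""
    question, cot, answer = example
    return (question, cot[n_removed:], answer)

def stepwise_internalization(model, dataset, finetune,
                             tokens_per_stage=8, num_stages=10):
    """Gradually delete leading CoT tokens and finetune after each deletion.

    At stage k the first k * tokens_per_stage CoT tokens are removed; by the
    final stage the chain is gone and the model must reason implicitly.
    """
    for stage in range(1, num_stages + 1):
        n_removed = stage * tokens_per_stage
        truncated = [remove_cot_tokens(ex, n_removed) for ex in dataset]
        finetune(model, truncated)  # continue training on shortened chains
    return model
```

In practice the removal schedule (how many tokens per stage, how many stages) is a hyperparameter; removing tokens gradually rather than all at once is what lets the model adapt its hidden states at each stage.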
Our approach is an alternative to the approach proposed by Deng et al. [6], which shares the goal of reasoning implicitly in the hidden states of transformers instead of relying on explicit CoT tokens. To teach the model to use hidden states for reasoning, that method employs a teacher model that performs explicit CoT reasoning and then distills the teacher's hidden states into the student model's hidden states. In comparison, our approach is much simpler yet more effective, and it demonstrates significant improvements over standard training methods. For instance,