Self-Refine: Iterative Refinement with Self-Feedback
Motivated by how humans refine their written text, we introduce SELF-REFINE, an approach for improving initial outputs from LLMs through iterative feedback and refinement. The main idea is to generate an initial output using an LLM; the same LLM then provides feedback on its own output and uses that feedback to refine it, iteratively. SELF-REFINE does not require any supervised training data, additional training, or reinforcement learning; instead, it uses a single LLM as the generator, the refiner, and the feedback provider.
We present SELF-REFINE: an iterative self-refinement algorithm that alternates between two generative steps, FEEDBACK and REFINE, which work in tandem to produce high-quality outputs. Given an initial output generated by a model M, we pass it back to the same model M to obtain feedback. Then, the feedback is passed back to the same model, which refines the previously generated draft. This process repeats either for a specified number of iterations or until M determines that no further refinement is necessary.
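To make the loop concrete, below is a minimal sketch of this alternation in Python. The `llm` function is a hypothetical stand-in for a call to the single model M, and the prompt templates and the "STOP" convention are illustrative assumptions, not the paper's actual prompts.

```python
def llm(prompt: str) -> str:
    """Hypothetical wrapper around a single LLM M; replace with a real API call."""
    raise NotImplementedError


def self_refine(task: str, max_iters: int = 4) -> str:
    # Initial generation: M produces a first draft.
    output = llm(f"Task: {task}\nDraft a response:")
    for _ in range(max_iters):
        # FEEDBACK: the same model M critiques its own output.
        feedback = llm(
            f"Task: {task}\nOutput: {output}\n"
            "Give actionable feedback, or say 'STOP' if no refinement is needed:"
        )
        if "STOP" in feedback:
            break  # M has determined that no further refinement is necessary.
        # REFINE: M rewrites the draft conditioned on its own feedback.
        output = llm(
            f"Task: {task}\nPrevious output: {output}\n"
            f"Feedback: {feedback}\nRevise the output accordingly:"
        )
    return output
```

Note that the same model is invoked at every step; only the prompt changes, which is what allows the method to run without any additional training.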
Consider, for example, drafting an email to a colleague: an individual may initially write a direct request such as “Send me the data ASAP”. Upon reflection, however, the writer recognizes the potential impoliteness of the phrasing and revises it to “Hi Ashley, could you please send me the data at your earliest convenience?”. When writing code, a programmer may produce an initial “quick and dirty” implementation and then, upon reflection, refactor it into a solution that is more efficient and readable. In this paper, we demonstrate that LLMs can provide iterative self-refinement without additional training, leading to higher-quality outputs on a wide range of tasks.