Self-Reflection in LLM Agents: Effects on Problem-Solving Performance

Paper · arXiv 2405.06682 · Published May 5, 2024
Keywords: Self-Refinement, Self-Consistency, Feedback

To improve the performance of LLM agents, we can equip them with a range of cognitive capabilities, such as chain-of-thought (CoT) prompting [1–3], access to external memory [22–25], and the ability to learn from feedback [10, 18, 19]. Learning from feedback can be decomposed into several components: the source of the feedback, the type of feedback, and the strategy used to learn from it [11]. There are two sources of feedback (internal or external) and two main types of feedback (scalar values or natural language) [11, 12].

There are also several strategies for learning from feedback, distinguished by where they occur in the LLM's output-generation process: at model-training time, at output-generation time, or after the output has been generated. Within each of these three phases, various techniques are available (e.g., model fine-tuning, output re-ranking, and self-correction) [11].
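As an illustration of one generation-time technique, output re-ranking, the minimal Python sketch below samples several candidate answers and keeps the one preferred by a separate scorer. The generate and score callables are hypothetical stand-ins (e.g., an LLM sampler and a reward model); they are not components described in the paper.

    from typing import Callable, List

    def rerank_answers(
        question: str,
        generate: Callable[[str], str],      # hypothetical sampler, e.g. one LLM call
        score: Callable[[str, str], float],  # scalar feedback, e.g. a reward model
        n_samples: int = 5,
    ) -> str:
        """Sample several candidate answers and return the highest-scoring one."""
        candidates: List[str] = [generate(question) for _ in range(n_samples)]
        scores = [score(question, c) for c in candidates]
        return candidates[scores.index(max(scores))]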


Within self-correction, several methods are currently being investigated, including iterative refinement, multi-model debate, and self-reflection [11].
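The following minimal sketch shows the shape of one such self-correction method, iterative refinement: generate an answer, obtain natural-language feedback on it, and revise until the critic is satisfied. The generate, critique, and refine callables and the stop token are assumptions for illustration, not the paper's implementation.

    from typing import Callable

    def iterative_refinement(
        question: str,
        generate: Callable[[str], str],          # initial answer from the LLM
        critique: Callable[[str, str], str],     # natural-language feedback on the answer
        refine: Callable[[str, str, str], str],  # revised answer given the feedback
        max_rounds: int = 3,
        stop_token: str = "LOOKS GOOD",
    ) -> str:
        """Generate an answer, then repeatedly critique and revise it."""
        answer = generate(question)
        for _ in range(max_rounds):
            feedback = critique(question, answer)
            if stop_token in feedback:  # the critic signals no remaining issues
                break
            answer = refine(question, answer, feedback)
        return answer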

Self-reflection in LLM agents is a metacognitive strategy also known as introspection [13, 14]. Some studies have indicated that LLMs using self-reflection can identify and correct their mistakes [8, 10, 12, 15]. Others have indicated that LLMs cannot reliably identify errors in their own reasoning, but may still be able to correct them when given external feedback [7, 26].

We investigated eight types of self-reflecting LLM agents. These agents reflect upon their own CoT and then generate self-reflections to use when attempting to re-answer questions; each agent generates and uses a distinct type of self-reflection. We also included a single non-reflecting (i.e., Baseline) agent as our control. Listed below are the agent types and the kind of self-reflection each generates and uses to re-answer questions; a minimal code sketch of this reflect-and-retry loop follows the list:

• Baseline - no self-reflection capabilities.

• Retry - informed that it answered incorrectly and simply tries again.

• Keywords - a list of keywords for each type of error.

• Advice - a list of general advice for improvement.

• Explanation - an explanation of why it made an error.

• Instructions - an ordered list of instructions for how to solve the problem.

• Solution - a step-by-step solution to the problem.

• Composite - all six types of self-reflections.

• Unredacted - all six types of self-reflections without the answers redacted.
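To make the protocol concrete, here is a minimal sketch of the reflect-and-retry loop described above, assuming a single ask_llm call per step; the prompt wording and the REFLECTION_PROMPTS table are illustrative assumptions rather than the paper's exact prompts.

    from typing import Callable, Optional

    # Illustrative reflection prompts; the paper's exact wording may differ.
    REFLECTION_PROMPTS: dict[str, Optional[str]] = {
        "Retry": None,  # no reflection text, the agent simply tries again
        "Keywords": "List keywords describing the kind of error you made.",
        "Advice": "Give general advice for avoiding this kind of error.",
        "Explanation": "Explain why your answer was incorrect.",
        "Instructions": "Write an ordered list of instructions for solving this problem.",
        "Solution": "Write a step-by-step solution to this problem.",
    }

    def answer_with_reflection(
        question: str,
        correct: Callable[[str], bool],  # grading signal (external feedback)
        ask_llm: Callable[[str], str],   # hypothetical single LLM call
        agent_type: str = "Advice",
    ) -> str:
        """First attempt with CoT; on an incorrect answer, self-reflect and re-answer."""
        first = ask_llm(f"{question}\nThink step by step, then answer.")
        if agent_type == "Baseline" or correct(first):
            return first
        reflection = ""
        prompt = REFLECTION_PROMPTS.get(agent_type)
        if prompt is not None:
            reflection = ask_llm(
                f"You answered the question below incorrectly.\n"
                f"Question: {question}\nYour reasoning: {first}\n{prompt}"
            )
        # Re-answer using the self-reflection (the agent is told it was wrong).
        return ask_llm(
            f"You previously answered this question incorrectly.\n"
            f"Reflection: {reflection}\nQuestion: {question}\n"
            f"Think step by step, then answer."
        )

The Composite and Unredacted agents would concatenate all six reflection types (with and without the answers redacted, respectively); that bookkeeping is omitted from the sketch for brevity.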