Reflexion: an autonomous agent with dynamic memory and self-reflection

Paper · arXiv 2303.11366 · Published March 20, 2023

Recent advancements in decision-making large language model (LLM) agents have demonstrated impressive performance across various benchmarks. However, these state-of-the-art approaches typically necessitate internal model fine-tuning, external model fine-tuning, or policy optimization over a defined state space. Implementing these methods can prove challenging due to the scarcity of high-quality training data or the lack of a well-defined state space. Moreover, these agents lack certain qualities inherent to human decision-making, specifically the ability to learn from mistakes. Self-reflection allows humans to efficiently solve novel problems through a process of trial and error. Building on recent research, we propose Reflexion, an approach that endows an agent with dynamic memory and self-reflection capabilities to enhance its existing reasoning trace and task-specific action choice abilities.

A binary reward model is a reward function that assigns a value of 0 or 1 to an action taken by the agent in the current state: 1 indicates a successful outcome and 0 indicates an unsuccessful outcome. A binary reward function was chosen to tightly constrain the agent’s knowledge to the observations from the world and the status of success or failure in a given environment, as opposed to a multi-valued or continuous output from a more descriptive reward model that the agent could use to evaluate its current performance.
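As a minimal sketch of the binary reward described above, the function below maps a state and action to 0 or 1. The names `binary_reward` and `is_success`, and the idea of passing the success check in as a callable, are assumptions made for illustration; the text does not specify how success is determined in any particular environment.

```python
from typing import Callable


def binary_reward(
    state: str,
    action: str,
    is_success: Callable[[str, str], bool],
) -> int:
    """Return 1 for a successful outcome and 0 otherwise.

    `is_success` is a hypothetical, environment-specific predicate.
    No intermediate or continuous score is ever produced, so the agent
    learns nothing beyond success or failure from this signal.
    """
    return 1 if is_success(state, action) else 0
```

For example, a toy predicate that checks whether the action is `"open door"` while the state is `"at door"` would yield a reward of 1 only for that exact pair.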

We defined hallucination as the occurrence of two or more consecutive identical actions in which the environment responded with the same observation, and inefficient planning as the occurrence of a trajectory in which the agent executed more than 30 actions without reaching a successful state.
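The two definitions above can be sketched as simple checks over a trajectory. This is an illustrative sketch, not the authors' code: representing a trajectory as a list of `(action, observation)` string pairs, and the function names, are assumptions. The threshold of 30 actions comes from the text.

```python
from typing import List, Tuple

Step = Tuple[str, str]  # (action, observation); representation is assumed

MAX_ACTIONS = 30  # threshold stated in the text


def is_hallucinating(trajectory: List[Step]) -> bool:
    """True if any two consecutive steps share the same action and observation."""
    return any(prev == curr for prev, curr in zip(trajectory, trajectory[1:]))


def is_inefficient(trajectory: List[Step], succeeded: bool) -> bool:
    """True if more than MAX_ACTIONS actions were taken without success."""
    return not succeeded and len(trajectory) > MAX_ACTIONS


def should_terminate(trajectory: List[Step], succeeded: bool = False) -> bool:
    """Combine both checks into one termination heuristic."""
    return is_hallucinating(trajectory) or is_inefficient(trajectory, succeeded)
```

Calling `should_terminate` after every step would stop a trial as soon as the agent repeats an action that produced no change, or once it passes the action budget without succeeding.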

Reflexion describes a broadly applicable approach that may be used to equip agents to solve a variety of complex tasks. A Reflexion agent must have access to a heuristic for termination and a binary reward model.
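One way to picture how these two requirements might fit into a trial-and-error loop is sketched below. Everything here is an assumption made for illustration: `run_trial`, `reflect`, `max_trials`, and storing reflections as a list of strings are hypothetical and are not taken from the paper. The termination heuristic is assumed to live inside `run_trial`, which ends a trial and reports a binary reward together with a text trace.

```python
from typing import Callable, List, Tuple


def reflexion_loop(
    run_trial: Callable[[List[str]], Tuple[int, str]],
    reflect: Callable[[str], str],
    max_trials: int = 3,
) -> Tuple[int, List[str]]:
    """Retry a task, carrying self-reflections forward between trials.

    run_trial(memory) -> (reward, trace): hypothetical; runs one trial with
        the current memory and returns a binary reward (0 or 1) and a trace.
    reflect(trace) -> str: hypothetical; turns a failed trace into a lesson.
    """
    memory: List[str] = []  # dynamic memory of past reflections
    reward = 0
    for _ in range(max_trials):
        reward, trace = run_trial(memory)
        if reward == 1:  # success: stop retrying
            break
        memory.append(reflect(trace))  # failure: store a reflection
    return reward, memory
```

In practice `run_trial` and `reflect` would each wrap calls to a language model; plain callables keep the sketch independent of any particular model or API.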