CLIN: A Continually Learning Language Agent for Rapid Task Adaptation and Generalization
Language agents have shown some ability to interact with an external environment, e.g., a virtual world such as ScienceWorld, to perform complex tasks, e.g., growing a plant, without the startup costs of reinforcement learning. However, despite their zero-shot capabilities, these agents to date do not continually improve over time, beyond performance refinement on a specific task. Here we present CLIN, the first language-based agent to achieve this, so that it continually improves over multiple trials, including when both the environment and task are varied, and without requiring parameter updates. Our approach is to use a persistent, dynamic, textual memory, centered on causal abstractions (rather than general “helpful hints”), that is regularly updated after each trial so that the agent gradually learns useful knowledge for new trials. In the ScienceWorld benchmark, CLIN is able to continually improve on repeated trials on the same task and environment, outperforming state-of-the-art reflective language agents like Reflexion by 23 absolute points. CLIN can also transfer its learning to new environments (or new tasks), improving its zero-shot performance by 4 points (13 for new tasks) and can further improve performance there through continual memory updates, enhancing performance by an additional 17 points (7 for new tasks). This suggests a new architecture for agents built on frozen models that can still continually and rapidly improve over time.
Our goal is a system that will continually improve over time, both while attempting the same task in the same environment, and across different tasks and environments. Our approach builds on prior work on reflection in two ways. First, we conjecture that a specific style of insight will be useful, namely one that captures causal abstractions about the agent’s actions, e.g., “opening doors may be necessary for movement between rooms”. Causal abstractions can potentially help the agent decide which action to take in the future, and can be viewed as a kind of action model learning (Arora et al., 2018), but placed in the modern context of language models. Second, we maintain these abstractions in a continually evolving, dynamic memory, which is regularly updated as the agent gains experience, allowing useful causal knowledge to persist (and unhelpful knowledge to be dropped) over time and between tasks and environments, as illustrated in Figure 1.
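The trial-then-update loop described above can be sketched as follows. This is a minimal, illustrative sketch, not the paper's implementation: the function names (`extract_abstractions`, `update_memory`) are hypothetical, and the LLM call that distills a trial trace into causal abstractions is stubbed out with pre-written insights.

```python
# Sketch of a CLIN-style loop: after each trial, causal abstractions are
# merged into a persistent textual memory that conditions later trials.
# All identifiers here are illustrative assumptions, not the paper's API.

def extract_abstractions(trace):
    """Stand-in for an LLM call that turns a trial trace into causal
    abstractions, e.g. 'opening doors may be necessary for moving
    between rooms'. Here we simply pass through canned insights."""
    return trace["insights"]

def update_memory(memory, trace):
    """Merge new causal abstractions into the persistent memory;
    a newer judgment about an action overrides an older one."""
    for action, insight in extract_abstractions(trace).items():
        memory[action] = insight
    return memory

memory = {}  # persists across trials (and, in CLIN, across tasks/environments)
trials = [
    {"insights": {"open door": "may be necessary for moving between rooms"}},
    {"insights": {"activate stove": "may be necessary for boiling liquids",
                  "open door": "is necessary for moving between rooms"}},
]
for trace in trials:
    memory = update_memory(memory, trace)

print(memory["open door"])  # the hedged insight is sharpened after trial 2
```

The key design point mirrored here is that learning happens purely in the textual memory, with no parameter updates to the underlying (frozen) model.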
We operationalize and evaluate this approach in a memory-augmented language agent called CLIN (continual learning from interactions). CLIN operates in ScienceWorld (Wang et al., 2022), a virtual, text-based environment in which an agent is tasked with science-based goals, e.g., boiling a liquid or growing a plant. We find that CLIN rapidly learns about the environment and its action vocabulary and continually improves on repeated trials on the same task and environment, outperforming state-of-the-art (SOTA) reflective language agents like Reflexion by 23 points. CLIN can also transfer its learning to new environments (or tasks), improving its zero-shot performance by 4 points (13 for new tasks), and can further improve performance through continual memory updates, enhancing performance by an additional 17 points (7 for new tasks). Our contributions are as follows:
• For memory-based language agents, we show that memory of causal abstractions is effective at helping the agents learn over an extended period and in varying conditions.
• We describe and evaluate CLIN, an architecture for a novel nonparametric learning paradigm. We show that CLIN learns faster than prior systems and generalizes better to new tasks and new environments, achieving state-of-the-art performance.
Overall, this work suggests that a dynamically maintained memory, centered around causal knowledge, is a promising way forward for agents built on frozen models to continually improve over time.