AgentGen: Enhancing Planning Abilities for Large Language Model based Agent via Environment and Task Generation
Planning ability is a crucial component of an LLM-based agent. It involves interacting with the environment and executing actions to complete a planning task, which generally entails reaching a desired goal from an initial state. This paper investigates enhancing the planning abilities of LLM-based agents through instruction tuning, referred to as agent training. Recent studies on agent training have demonstrated that instruction-tuning LLMs on expert-level trajectory data (sequences of action-observation pairs) effectively enhances their planning capabilities. However, existing work primarily focuses on synthesizing trajectories from manually designed planning tasks and environments. Because creating these environments and tasks is labor-intensive, it is difficult to generate trajectories that are sufficiently varied and extensive for agent training.
We introduce a framework, AGENTGEN, that leverages LLMs first to generate environments and then to generate planning tasks conditioned on these environments. Specifically, to improve environmental diversity, we propose using an inspiration corpus composed of various domain-specific text segments as the context for synthesizing environments.
To increase the difficulty diversity of generated planning tasks, we propose a bidirectional evolution method, BI-EVOL, that evolves planning tasks in both easier and harder directions to synthesize a task set with a smoother difficulty curve, thereby making the learning process of LLMs more effective. Together, these methods contribute to the generation of diverse trajectory data for instruction tuning.
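The bidirectional idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes evolution starts from a seed task and that a hypothetical `evolve(task, direction)` callable (standing in for an LLM prompt) returns an easier or harder variant. The `toy_evolve` function and the `goal:N` task encoding are invented purely to make the example runnable.

```python
from typing import Callable, List


def bi_evol(seed_task: str,
            evolve: Callable[[str, str], str],
            steps: int = 2) -> List[str]:
    """Evolve a seed task in both directions and return the tasks ordered
    from easiest to hardest (assumed procedure, for illustration only)."""
    easier: List[str] = []
    task = seed_task
    for _ in range(steps):
        task = evolve(task, "easier")  # hypothetical call, e.g. an LLM prompt
        easier.append(task)

    harder: List[str] = []
    task = seed_task
    for _ in range(steps):
        task = evolve(task, "harder")  # hypothetical call, e.g. an LLM prompt
        harder.append(task)

    # Easiest variants first, then the seed, then increasingly hard variants.
    return easier[::-1] + [seed_task] + harder


def toy_evolve(task: str, direction: str) -> str:
    """Stand-in for an LLM: difficulty is the integer after 'goal:'."""
    level = int(task.split(":")[1])
    level = max(1, level - 1) if direction == "easier" else level + 1
    return f"goal:{level}"


curriculum = bi_evol("goal:3", toy_evolve, steps=2)
print(curriculum)
```

Ordering the result from easiest to hardest is one way to picture the "smoother difficulty curve"; how the evolved tasks are actually selected or ordered for training is not specified here.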
AGENTGEN utilizes LLMs to construct diverse environments and planning tasks for agent training, expanding the available environments from a few to hundreds. More specifically, AGENTGEN is structured around two stages: (1) Environment Generation: Achieving sufficient environmental diversity is essential for creating diverse planning tasks, and it requires covering a broad range of scenarios and domains. To ensure this, we use an inspiration corpus composed of diverse text segments as context for generating environment specifications with LLMs, in which actions, restrictions, and other details are defined in natural language.
Subsequently, we prompt the LLM to produce the corresponding code based on this specification, which may be written in Python, the Planning Domain Definition Language (PDDL) [36], or other domain-specific languages.
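The specification-then-code flow can be illustrated as a two-step prompt chain. The sketch below is an assumption-laden illustration rather than the actual pipeline: the `llm` callable, the prompt wording, and the `fake_llm` placeholder are all hypothetical, and a real system would call an actual model and validate the generated code.

```python
from typing import Callable, Dict


def generate_environment(inspiration_text: str,
                         llm: Callable[[str], str]) -> Dict[str, str]:
    """Two-step generation: natural-language specification, then code."""
    # Step 1: specification conditioned on a text segment from the corpus.
    spec_prompt = (
        "Using the following text as inspiration, describe a planning "
        "environment: its actions, restrictions, and other details.\n\n"
        f"{inspiration_text}"
    )
    specification = llm(spec_prompt)

    # Step 2: code conditioned on the specification (PDDL chosen here).
    code_prompt = (
        "Write a PDDL domain that implements this specification:\n\n"
        f"{specification}"
    )
    code = llm(code_prompt)
    return {"specification": specification, "code": code}


def fake_llm(prompt: str) -> str:
    """Placeholder for a real model call (hypothetical)."""
    if prompt.startswith("Write a PDDL domain"):
        return "(define (domain placeholder))"
    return "Placeholder specification with actions and restrictions."


env = generate_environment("How to bake sourdough bread.", fake_llm)
print(env["code"])
```

Separating the two steps keeps the natural-language specification available as an intermediate artifact that can be inspected before any code is produced.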
We synthesized environments and planning tasks based on PDDL [36] and constructed a dataset comprising 592 environments, each with 20 tasks.
For example, in a PDDL-based planning problem, the domain PDDL file can be regarded as the environment E: it defines states (predicates) and actions and specifies the transition function through the preconditions and effects of each action. The problem PDDL file, on the other hand, can be seen as the task T. Both initial states and goal conditions are typically defined as combinations of predicates. Another widely used programming language for constructing planning problems is Python. For example, in OpenAI Gym, a planning problem is implemented as a Python class whose transition function is a method of the class, usually named "step" or "update".
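To make the correspondence concrete, the following sketch expresses a PDDL-style environment as a small Python class. It is an illustrative toy, not code from the paper: states are sets of predicate strings, each action has preconditions and add/delete effects, and `step` plays the role of the transition function. The class does not use the Gym library, its return signature is simplified, and the door example and all names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Iterable, Tuple


@dataclass(frozen=True)
class Action:
    preconditions: FrozenSet[str]  # predicates that must hold before acting
    add_effects: FrozenSet[str]    # predicates made true by the action
    del_effects: FrozenSet[str]    # predicates made false by the action


class ToyPlanningEnv:
    """Environment E = actions; task T = (initial state, goal)."""

    def __init__(self, actions: Dict[str, Action],
                 initial_state: Iterable[str], goal: Iterable[str]) -> None:
        self.actions = actions
        self.initial_state = frozenset(initial_state)
        self.goal = frozenset(goal)
        self.state = self.initial_state

    def reset(self) -> FrozenSet[str]:
        self.state = self.initial_state
        return self.state

    def step(self, name: str) -> Tuple[FrozenSet[str], bool, str]:
        """Transition function: apply the effects if the preconditions hold."""
        action = self.actions[name]
        if not action.preconditions <= self.state:
            return self.state, False, "precondition not satisfied"
        self.state = (self.state - action.del_effects) | action.add_effects
        return self.state, self.goal <= self.state, "ok"


actions = {
    "open-door": Action(frozenset({"door-closed"}),
                        frozenset({"door-open"}),
                        frozenset({"door-closed"})),
    "walk-through": Action(frozenset({"door-open", "at-room-a"}),
                           frozenset({"at-room-b"}),
                           frozenset({"at-room-a"})),
}
env = ToyPlanningEnv(actions, {"door-closed", "at-room-a"}, {"at-room-b"})
print(env.step("walk-through"))  # rejected: the door is still closed
print(env.step("open-door"))
print(env.step("walk-through"))  # goal reached
```

Here the `actions` dictionary corresponds to the domain file, while the initial state and goal passed to the constructor correspond to the problem file.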
We propose a framework for environment generation structured around three main components: (1) an environment specification generation module, in which an LLM first generates a specification of the environment, typically including a general overview of the environment, descriptions of the state space and action space, and definitions of the transition functions; (2) an environment implementation module that generates the corresponding code based on the environment specification; and (3) an environment library that stores previously generated high-quality environments, serving as a comprehensive environment dataset and providing in-context examples for generating new environments.
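The role of the environment library as a source of in-context examples can be sketched as follows. This is a hypothetical illustration: the `EnvironmentLibrary` class, its method names, the random sampling strategy, and the prompt layout are all assumptions, and the quality check that decides which environments are stored is deliberately left out.

```python
import random
from typing import Dict, List


class EnvironmentLibrary:
    """Stores (specification, code) pairs and builds few-shot prompts."""

    def __init__(self, seed: int = 0) -> None:
        self._envs: List[Dict[str, str]] = []
        self._rng = random.Random(seed)  # seeded for reproducible sampling

    def __len__(self) -> int:
        return len(self._envs)

    def add(self, specification: str, code: str) -> None:
        # Assumes the caller has already judged the environment high-quality.
        self._envs.append({"specification": specification, "code": code})

    def sample_examples(self, k: int) -> List[Dict[str, str]]:
        # Random sampling is an assumption; other strategies are possible.
        return self._rng.sample(self._envs, min(k, len(self._envs)))

    def build_prompt(self, new_specification: str, k: int = 2) -> str:
        """Prepend k stored environments as in-context examples."""
        parts = [
            f"Specification:\n{ex['specification']}\nCode:\n{ex['code']}\n"
            for ex in self.sample_examples(k)
        ]
        parts.append(f"Specification:\n{new_specification}\nCode:\n")
        return "\n".join(parts)


library = EnvironmentLibrary()
library.add("Spec A", "(define (domain a))")
library.add("Spec B", "(define (domain b))")
prompt = library.build_prompt("Spec C", k=1)
print(prompt)
```

Each newly accepted environment can be added back through `add`, so the pool of in-context examples grows as generation proceeds.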