Learn-by-interact: A Data-Centric Framework for Self-Adaptive Agents in Realistic Environments

Paper · arXiv 2501.10893 · Published January 18, 2025

We construct task instructions using LLMs for each sub-trajectory, a process called backward construction. The synthesized data are then filtered and used for both training and in-context learning, where we design agentic retrieval to retrieve demonstration examples based on information at each step, using both model-based and observation-based approaches.
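Backward construction can be pictured as enumerating sub-trajectories of a synthesized interaction and asking an LLM to write an instruction for each one. The sketch below is an illustrative assumption, not the paper's implementation: the trajectory format, the helper names, and the `summarize` stand-in (which replaces a real LLM call) are all hypothetical.

```python
# Hedged sketch of backward construction. The trajectory schema and the
# summarize() stand-in (a placeholder for an LLM call) are assumptions.

def sub_trajectories(trajectory):
    """Enumerate contiguous sub-trajectories (here: prefixes ending at each step)."""
    return [trajectory[:i] for i in range(1, len(trajectory) + 1)]

def summarize(sub_traj):
    """Stand-in for an LLM that abstracts a sub-trajectory into an instruction."""
    last_action = sub_traj[-1]["action"]
    return f"Perform the steps leading to: {last_action}"

def backward_construct(trajectory):
    """Pair each sub-trajectory with a synthesized instruction."""
    return [(summarize(s), s) for s in sub_trajectories(trajectory)]

# Toy trajectory for illustration only.
traj = [
    {"action": "open settings", "observation": "settings page"},
    {"action": "click 'display'", "observation": "display options"},
]
pairs = backward_construct(traj)
```

Each resulting `(instruction, sub-trajectory)` pair is a candidate training or demonstration example; in the actual framework these pairs are subsequently filtered for quality.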

However, given the low performance of LLMs on existing agentic benchmarks (Cao et al., 2024; Xie et al., 2024), a large fraction of synthesized trajectories are likely to mismatch their original instructions. To tackle this challenge, we construct new instructions by summarizing or abstracting each sub-trajectory, leveraging the strong summarization capabilities of LLMs (Liu et al., 2023; Pu et al., 2023); we call this process backward construction. After obtaining synthesized instruction-trajectory pairs and filtering out low-quality ones, we apply the resulting data to both training and in-context learning (ICL), for which we craft retrieval pipelines optimized for agents. Specifically, the approach comprises two components: (1) a model-based approach, in which LLMs generate queries guided by the instruction, the interaction history, and the current observation, and retrieval models then select demonstration examples from the synthesized data; and (2) an observation-based approach, which identifies examples whose trajectories contain the current observation, signaling that the current state was encountered during data synthesis.
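The two retrieval components described above can be sketched as follows. This is a minimal illustration under assumptions: token-overlap scoring stands in for a real retrieval model, exact string matching stands in for observation matching, and the example corpus and function names are hypothetical.

```python
# Hedged sketch of the two-pronged agentic retrieval. Scoring, matching,
# and the corpus layout are illustrative assumptions, not the paper's code.

def model_based_retrieve(query, corpus, k=2):
    """Rank synthesized examples against an LLM-generated query.
    Token overlap is a stand-in for a learned retrieval model."""
    q = set(query.lower().split())
    scored = sorted(
        corpus,
        key=lambda ex: len(q & set(ex["instruction"].lower().split())),
        reverse=True,
    )
    return scored[:k]

def observation_based_retrieve(current_obs, corpus):
    """Keep examples whose trajectory already visited the current observation,
    i.e. states encountered during data synthesis."""
    return [
        ex for ex in corpus
        if any(step["observation"] == current_obs for step in ex["trajectory"])
    ]

# Toy synthesized-data corpus for illustration only.
corpus = [
    {"instruction": "change the display settings",
     "trajectory": [{"observation": "settings page"},
                    {"observation": "display options"}]},
    {"instruction": "send an email to the team",
     "trajectory": [{"observation": "inbox"}]},
]

hits = model_based_retrieve("adjust display settings", corpus, k=1)
matches = observation_based_retrieve("display options", corpus)
```

In this toy setup, both routes surface the first example: the model-based route because its instruction overlaps the query, and the observation-based route because its trajectory contains the agent's current observation.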