TDAG: A Multi-Agent Framework based on Dynamic Task Decomposition and Agent Generation
Agents often struggle during task execution because of methodological constraints that lead to error propagation and limited adaptability. To address this issue, we propose TDAG, a multi-agent framework based on dynamic Task Decomposition and Agent Generation. The framework dynamically decomposes a complex task into smaller subtasks and assigns each to a specifically generated subagent, which improves adaptability in diverse and unpredictable real-world tasks. In addition, existing benchmarks often lack the granularity needed to evaluate incremental progress in complex, multi-step tasks. In response, we introduce ItineraryBench, a travel-planning benchmark featuring interconnected, progressively complex tasks and a fine-grained evaluation system.
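As a rough illustration of the decompose-then-generate loop described above, the following minimal sketch shows one way such a control flow could be organized. It is not the TDAG implementation: the names `SubAgent`, `generate_agent`, `run_task`, `split_on_then`, and `keep_plan` are hypothetical, the subagent only returns a placeholder string instead of calling a model or tools, and the `revise` hook is an assumed stand-in for whatever mechanism updates the remaining subtasks after each result.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SubAgent:
    """Hypothetical subagent created for exactly one subtask."""
    name: str
    instructions: str

    def run(self, subtask: str, context: List[str]) -> str:
        # Placeholder: a real subagent would call a model and tools here.
        return f"[{self.name}] done: {subtask}"


def generate_agent(subtask: str, index: int) -> SubAgent:
    # Assumption: the agent is specialised only through its instructions.
    return SubAgent(name=f"agent_{index}",
                    instructions=f"You handle only this subtask: {subtask}")


def run_task(task: str,
             decompose: Callable[[str], List[str]],
             revise: Callable[[List[str], str], List[str]],
             max_steps: int = 20) -> List[str]:
    """Decompose a task, then solve subtasks one by one with fresh agents."""
    pending = decompose(task)
    results: List[str] = []
    step = 0
    while pending and step < max_steps:
        subtask = pending.pop(0)
        agent = generate_agent(subtask, step)
        result = agent.run(subtask, results)
        results.append(result)
        # Dynamic part: the remaining plan may change after every result.
        pending = revise(pending, result)
        step += 1
    return results


def split_on_then(task: str) -> List[str]:
    # Toy decomposer for the example: split the task text on " then ".
    return [part.strip() for part in task.split(" then ")]


def keep_plan(pending: List[str], result: str) -> List[str]:
    # Toy reviser for the example: leave the remaining subtasks unchanged.
    return pending
```

Calling `run_task("book a train ticket then plan the city route", split_on_then, keep_plan)` runs two subtasks, each handled by its own generated agent. Passing the reviser as a function keeps the sketch independent of how replanning is actually performed.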
Our benchmark focuses on agent-assisted travel planning. To accomplish a task, agents must use several computer tools, including a database and a code interpreter. We also provide a simulator that mimics dynamic real-world scenarios and covers the entire travel-planning pipeline, from ticket booking to route and time planning. With the simulator, we can assess agents’ partial task completion and assign a nuanced score.
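To make the idea of scoring partial completion concrete, the sketch below computes a weighted fraction of passed checks. This is an assumed, simplified scheme for illustration only, not the ItineraryBench scoring rule: the function `partial_score`, the check names, and the weights are all hypothetical.

```python
from typing import Dict


def partial_score(checks: Dict[str, bool], weights: Dict[str, float]) -> float:
    """Return the weighted share of checks that passed, in [0, 1]."""
    total = sum(weights[name] for name in checks)
    if total == 0:
        return 0.0
    earned = sum(weights[name] for name, passed in checks.items() if passed)
    return earned / total


# Hypothetical checks for one itinerary; names and weights are illustrative.
example_checks = {
    "tickets_booked": True,
    "route_feasible": True,
    "within_budget": False,
}
example_weights = {
    "tickets_booked": 1.0,
    "route_feasible": 1.0,
    "within_budget": 2.0,
}
example_score = partial_score(example_checks, example_weights)
```

Under these assumptions an agent that completes some steps but violates one constraint still receives credit for the steps it completed, rather than a binary pass or fail.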