Large Language Model based Multi-Agents: A Survey of Progress and Challenges
This survey provides an in-depth discussion of the essential aspects of multi-agent systems based on LLMs, as well as the challenges they face. Our goal is for readers to gain substantial insight into the following questions: What domains and environments do LLM-based multi-agents simulate? How are these agents profiled, and how do they communicate? What mechanisms contribute to the growth of agents' capacities?
Building on the inspiring capabilities of the single LLM-based agent, LLM-based Multi-Agents have been proposed to leverage the collective intelligence and the specialized profiles and skills of multiple agents. Compared to systems using a single LLM-powered agent, multi-agent systems offer advanced capabilities by 1) specializing LLMs into various distinct agents, each with different capabilities, and 2) enabling interactions among these diverse agents to simulate complex real-world environments effectively. In this context, multiple autonomous agents collaboratively engage in planning, discussion, and decision-making, mirroring the cooperative nature of human group work in problem-solving tasks. This approach capitalizes on the communicative capabilities of LLMs, leveraging their ability to generate text for communication and to respond to textual inputs. It also exploits LLMs' extensive knowledge across various domains and their latent potential to specialize in specific tasks. Recent research has demonstrated promising results in utilizing LLM-based multi-agents for solving various tasks, such as software development [Hong et al., 2023; Qian et al., 2023], multi-robot systems [Mandi et al., 2023; Zhang et al., 2023c], society simulation [Park et al., 2023; Park et al., 2022], policy simulation [Xiao et al., 2023; Hua et al., 2023], and game simulation [Xu et al., 2023c; Wang et al., 2023c]. Owing to the interdisciplinary nature of this field, it has attracted a diverse range of researchers, expanding beyond AI experts to include those from social science, psychology, and policy research.
We delve into these questions by discussing: 1) the agents-environment interface, which details how agents interact with the task environment; 2) agent profiling, which explains how an agent is characterized by an LLM to behave in specific ways; 3) agent communication, which examines how agents exchange messages and collaborate; and 4) agent capability acquisition, which explores how agents develop their abilities to effectively solve problems. A further perspective for reviewing LLM-MA studies is their applications.
Decision-making Thought: This term denotes the capability of LLM-based agents, guided by prompts, to break down complex tasks into smaller subgoals [Khot et al., 2023], think through each part methodically (sometimes exploring multiple paths) [Yao et al., 2023], and learn from past experiences [Shinn et al., 2023] to perform better decision-making on complex tasks. This capability enhances the autonomy of a single LLM-based agent and bolsters its effectiveness in problem-solving.
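A minimal sketch of such a decompose-and-reflect loop is shown below. Here, `call_llm` is a hypothetical placeholder for any chat-completion API, and the prompts are purely illustrative, not a specific system's implementation.

```python
def call_llm(prompt: str) -> str:
    """Placeholder: swap in a real chat-completion API call."""
    return f"<model output for: {prompt[:40]}...>"

def solve_with_decomposition(task: str) -> str:
    # 1) Break the complex task into smaller subgoals (cf. Khot et al., 2023).
    plan = call_llm(f"Decompose this task into numbered subgoals:\n{task}")
    results = []
    for subgoal in plan.splitlines():
        # 2) Work through each part methodically (cf. Yao et al., 2023).
        results.append(call_llm(f"Subgoal: {subgoal}\nSolve it step by step."))
    draft = call_llm("Combine these partial results into one answer:\n" + "\n".join(results))
    # 3) Reflect on the attempt and revise (cf. Shinn et al., 2023).
    critique = call_llm(f"Critique this answer; list concrete mistakes:\n{draft}")
    return call_llm(f"Revise the answer using the critique.\nCritique: {critique}\nAnswer: {draft}")
```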
Tool-use: LLM-based agents' tool-use capability allows them to leverage external tools and resources to accomplish tasks, enhancing their functional capabilities and enabling them to operate more effectively in diverse and dynamic environments [Li et al., 2023d; Ruan et al., 2023; Gao et al., 2023b].
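The sketch below illustrates a common tool-use pattern under stated assumptions: the model emits a structured tool call, the framework executes it, and the observation is fed back. `call_llm` and `web_search` are hypothetical stand-ins, not any real library's API.

```python
import json

def call_llm(prompt: str) -> str:
    """Placeholder: a real LLM would return a JSON tool call or a final answer."""
    return '{"answer": "placeholder"}'

def web_search(query: str) -> str:   # illustrative tool
    return f"top results for '{query}'"

TOOLS = {"web_search": web_search}

def act_with_tools(task: str, max_steps: int = 5) -> str:
    prompt = (f"Task: {task}\nTools: {list(TOOLS)}\n"
              'Reply in JSON as {"tool": ..., "input": ...} or {"answer": ...}')
    for _ in range(max_steps):                     # bounded act-observe loop
        decision = json.loads(call_llm(prompt))
        if "answer" in decision:
            return decision["answer"]
        observation = TOOLS[decision["tool"]](decision["input"])
        prompt += f"\nObservation: {observation}"  # feed the result back to the model
    return "stopped: tool budget exhausted"
```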
Memory: This ability refers to the capability of an LLM-based agent to use in-context learning [Dong et al., 2023a] as short-term memory, or an external vector database [Lewis et al., 2021] as long-term memory, to preserve and retrieve information over prolonged periods [Wang et al., 2023b]. It enables a single LLM-based agent to maintain contextual coherence and to learn from interactions.
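A minimal sketch of the two memory mechanisms follows, with a toy word-overlap score standing in for a real embedding model and vector database; the class and field names are illustrative.

```python
from collections import Counter

class AgentMemory:
    def __init__(self, window: int = 5):
        self.short_term = []   # recent turns kept in the prompt (in-context learning)
        self.long_term = []    # persisted records, retrieved by similarity
        self.window = window

    def remember(self, text: str) -> None:
        self.short_term = (self.short_term + [text])[-self.window:]
        self.long_term.append(text)

    def retrieve(self, query: str, k: int = 3) -> list[str]:
        def sim(a: str, b: str) -> int:  # toy overlap score; real systems use embeddings
            return sum((Counter(a.lower().split()) & Counter(b.lower().split())).values())
        return sorted(self.long_term, key=lambda t: sim(query, t), reverse=True)[:k]

    def build_prompt(self, query: str) -> str:
        return (f"Relevant past information: {self.retrieve(query)}\n"
                f"Recent context: {self.short_term}\nUser: {query}")
```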
LLM-MA systems emphasize diverse agent profiles, inter-agent interactions, and collective decision-making processes. From this perspective, more dynamic and complex tasks can be tackled through the collaboration of multiple autonomous agents, each equipped with unique strategies and behaviors and engaged in communication with the others.
For instance, in gaming environments, agents might be profiled as players with varying roles and skills, each contributing differently to the game's objectives. In software development, agents could take on the roles of product managers and engineers, each with responsibilities and expertise that guide the development process. Similarly, in a debating platform, agents might be designated as proponents, opponents, or judges, each with unique functions and strategies to fulfill their roles effectively. These profiles are crucial for defining the agents' interactions and effectiveness within their respective environments. Table 1 lists the agent profiles used in recent LLM-MA works.
Regarding the Agent Profiling Methods, we categorize them into three types: Pre-defined, Model-Generated, and Data-Derived. In Pre-defined cases, agent profiles are explicitly specified by the system designers. The Model-Generated method creates agent profiles with models, e.g., large language models. The Data-Derived method constructs agent profiles from pre-existing datasets.
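To make the three methods concrete, the sketch below shows one possible realization; `AgentProfile`, its fields, and `call_llm` are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass

def call_llm(prompt: str) -> str:
    """Placeholder for a real LLM call."""
    return f"<response to: {prompt[:30]}...>"

@dataclass
class AgentProfile:
    role: str
    goal: str
    traits: str

# 1) Pre-defined: the system designer writes the profile by hand.
engineer = AgentProfile(role="engineer", goal="implement the spec", traits="precise, terse")

# 2) Model-generated: an LLM drafts the profile from a task description.
def generate_profile(task: str) -> AgentProfile:
    role = call_llm(f"Name the single most useful role for this task: {task}")
    return AgentProfile(role=role, goal=task, traits=call_llm(f"Describe a {role}'s traits"))

# 3) Data-derived: the profile is built from pre-existing records (e.g., surveys).
def profile_from_record(record: dict) -> AgentProfile:
    return AgentProfile(role=record["occupation"], goal=record["stated_goal"],
                        traits=", ".join(record["traits"]))

# However obtained, the profile is typically injected into the agent's system prompt.
def system_prompt(p: AgentProfile) -> str:
    return f"You are a {p.role}. Your goal: {p.goal}. Your traits: {p.traits}."
```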
3.3 Agents Communication

The communication between agents in LLM-MA systems is the critical infrastructure supporting collective intelligence. We dissect agent communication from three perspectives: 1) Communication Paradigms: the styles and methods of interaction between agents; 2) Communication Structure: the organization and architecture of communication networks within the multi-agent system; and 3) Communication Content: what is exchanged between agents.
Communication Paradigms: Current LLM-MA systems mainly adopt three communication paradigms: Cooperative, Debate, and Competitive. Cooperative agents work together towards a shared goal, typically exchanging information to enhance a collective solution. The Debate paradigm is employed when agents engage in argumentative interactions, presenting and defending their own viewpoints or solutions and critiquing those of others. This paradigm is well suited for reaching a consensus or a more refined solution. Competitive agents work towards their own goals, which might conflict with the goals of other agents.
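A minimal sketch of the debate paradigm is given below, assuming a hypothetical `call_llm` placeholder: two agents defend positions over several rounds, after which a judge agent states a conclusion.

```python
def call_llm(prompt: str) -> str:
    return f"<response to: {prompt[:40]}...>"   # placeholder for a real LLM call

def debate(question: str, rounds: int = 2) -> str:
    transcript = [f"Question: {question}"]
    for r in range(rounds):
        for side in ("Proponent", "Opponent"):
            turn = call_llm(f"You are the {side}. Given the debate so far:\n"
                            + "\n".join(transcript)
                            + "\nPresent or defend your position and critique the other side.")
            transcript.append(f"{side} (round {r + 1}): {turn}")
    return call_llm("You are the Judge. Read the debate and state the most "
                    "defensible conclusion:\n" + "\n".join(transcript))
```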
Communication Structure: Fig. 3 shows four typical communication structures in LLM-MA systems. Layered communication is structured hierarchically, with agents at each level having distinct roles and primarily interacting within their layer or with adjacent layers. [Liu et al., 2023] introduces a framework called Dynamic LLM-Agent Network (DyLAN), which organizes agents in a multi-layered feed-forward network. This setup facilitates dynamic interactions, incorporating features like inference-time agent selection and an early-stopping mechanism, which collectively enhance the efficiency of cooperation among agents. Decentralized communication operates on a peer-to-peer network, where agents directly communicate with each other, a structure commonly employed in world simulation applications. Centralized communication involves a central agent or a group of central agents coordinating the system's communication, with other agents primarily interacting through this central node. The Shared Message Pool is proposed by MetaGPT [Hong et al., 2023] to improve communication efficiency: agents publish structured messages to the pool and subscribe to relevant messages based on their profiles.

Figure 3: The Agent Communication Structure.
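The sketch below shows a shared message pool in the spirit of MetaGPT's design; the class and method names are illustrative, not MetaGPT's actual API.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class Message:
    sender: str
    topic: str        # e.g., "requirements", "code", "review"
    content: str

class MessagePool:
    def __init__(self):
        self._messages: list[Message] = []
        self._subscriptions: dict[str, set[str]] = defaultdict(set)

    def subscribe(self, agent: str, topic: str) -> None:
        self._subscriptions[topic].add(agent)

    def publish(self, msg: Message) -> None:
        self._messages.append(msg)

    def pull(self, agent: str) -> list[Message]:
        # Each agent reads only the topics it subscribed to, avoiding
        # pairwise exchanges with every other agent.
        return [m for m in self._messages if agent in self._subscriptions[m.topic]]

pool = MessagePool()
pool.subscribe("engineer", "requirements")
pool.publish(Message("product_manager", "requirements", "Build a CLI todo app."))
print([m.content for m in pool.pull("engineer")])
```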
Communication Content: In LLM-MA systems, the Communication Content typically takes the form of text. The specific content varies widely and depends on the particular application. For example, in software development, agents may communicate with each other about code segments.
3.4 Agents Capabilities Acquisition

Agents capabilities acquisition is a crucial process in LLM-MA, enabling agents to learn and evolve dynamically. Two fundamental concepts are involved: the types of feedback from which agents learn to enhance their capabilities, and the strategies by which agents adjust themselves to effectively solve complex problems.
- Feedback from Agents Interactions: feedback that comes from the judgment of other agents or from inter-agent communication. It is common in problem-solving scenarios such as science debates, where agents learn to critically evaluate and refine conclusions through communication. In world simulation scenarios such as game simulation, agents learn to refine strategies based on previous interactions with other agents.
- Human Feedback: feedback that comes directly from humans and is crucial for aligning the multi-agent system with human values and preferences. This kind of feedback is widely used in most "human-in-the-loop" applications.
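A minimal sketch of learning from agent-interaction feedback follows: one agent's critique is fed back so the solving agent can refine its answer. `call_llm` is a hypothetical placeholder and the prompts are illustrative.

```python
def call_llm(prompt: str) -> str:
    return f"<response to: {prompt[:40]}...>"   # placeholder for a real LLM call

def refine_via_peer_feedback(task: str, iterations: int = 3) -> str:
    answer = call_llm(f"Solve: {task}")
    for _ in range(iterations):
        # A reviewer agent judges the current solution.
        critique = call_llm(f"You are a reviewer. Find flaws in this solution "
                            f"to '{task}':\n{answer}")
        # The solver incorporates the peer feedback into a revision.
        answer = call_llm(f"Revise your solution to '{task}' using this peer "
                          f"feedback:\n{critique}\nPrevious answer:\n{answer}")
    return answer
```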
Beyond retrieving historical information from memory to decide subsequent actions, as in Memory-based solutions, agents can dynamically self-evolve by modifying themselves, such as altering their initial goals and planning strategies, and by training themselves based on feedback or communication logs. [Nascimento et al., 2023] proposes a self-control loop process that allows each agent in a multi-agent system to be self-managed and self-adaptive to dynamic environments, thereby improving the cooperation efficiency of multiple agents. [Zhang et al., 2023b] introduces ProAgent, which anticipates teammates' decisions and dynamically adjusts each agent's strategy based on the communication logs between agents, facilitating mutual understanding and improving collaborative planning capability.
This Self-Evolution strategy enables agents to autonomously adjust their profiles or goals, rather than merely learning from historical interactions. A further strategy is Dynamic Generation: in some scenarios, the system can generate new agents on the fly during its operation.
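The sketch below illustrates self-evolution in a loose, generic form: after each episode the agent rewrites part of its own profile from the communication log, in the spirit of the self-adaptive loops above. All names are illustrative; this is not the method of Nascimento et al. [2023] or ProAgent itself.

```python
def call_llm(prompt: str) -> str:
    return f"<response to: {prompt[:40]}...>"   # placeholder for a real LLM call

class SelfEvolvingAgent:
    def __init__(self, profile: str):
        self.profile = profile
        self.log: list[str] = []

    def act(self, observation: str) -> str:
        reply = call_llm(f"{self.profile}\nObservation: {observation}\nAct:")
        self.log.append(f"obs={observation} act={reply}")
        return reply

    def evolve(self) -> None:
        # The agent edits its own goals and strategy based on what happened.
        self.profile = call_llm(
            "Rewrite this agent profile so it cooperates better, given the "
            f"episode log.\nProfile: {self.profile}\nLog: {self.log[-20:]}")
        self.log.clear()
```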