Can multi-agent teams automatically remove their weakest members?
Explores whether agents can score each other's contributions during problem-solving and use those scores to deactivate underperforming teammates in real time, improving overall team efficiency.
DyLAN (Dynamic LLM-Agent Network) introduces a systematic mechanism for multi-agent team optimization that addresses three properties simultaneously: task agnosticism, efficiency, and automatic team composition.
The core mechanism is the Agent Importance Score, computed through a three-step procedure:
- Propagation — each agent rates its predecessors on their solution quality
- Aggregation — for each agent, ratings from successors are compiled to quantify its contribution
- Selection — after summing ratings across all time steps, top-performing agents are retained and low-performing agents deactivated
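The three steps above can be sketched in a few lines. This is a minimal illustration of the procedure as described in this note, not DyLAN's exact formula: the rating format (a dict keyed by `(rater, rated)` pairs), the use of a mean for aggregation, and the names `agent_importance` and `top_k` are all assumptions made for the example.

```python
from collections import defaultdict


def agent_importance(ratings_by_step, top_k):
    """Return (kept_agents, totals) from per-step peer ratings.

    ratings_by_step: one dict per time step, mapping
        (rater, rated_predecessor) -> numeric rating.
    Agents that never receive a rating do not appear in the totals.
    """
    totals = defaultdict(float)
    for step_ratings in ratings_by_step:
        # Propagation: each entry is one agent rating a predecessor.
        received = defaultdict(list)
        for (_rater, rated), score in step_ratings.items():
            received[rated].append(score)
        # Aggregation (assumed here to be a mean): compile the ratings
        # an agent received from its successors at this step.
        for agent, scores in received.items():
            totals[agent] += sum(scores) / len(scores)
    # Selection: sum over time steps is already in `totals`; keep the top_k.
    # Ties are broken by agent name so the result is deterministic.
    ranked = sorted(totals, key=lambda a: (-totals[a], a))
    return ranked[:top_k], dict(totals)


# Hypothetical ratings for three agents over two time steps.
example_ratings = [
    {("B", "A"): 0.9, ("C", "A"): 0.7, ("A", "B"): 0.2,
     ("C", "B"): 0.4, ("A", "C"): 0.6, ("B", "C"): 0.8},
    {("B", "A"): 0.8, ("C", "A"): 0.6, ("A", "B"): 0.1,
     ("C", "B"): 0.3, ("A", "C"): 0.5, ("B", "C"): 0.9},
]
kept, totals = agent_importance(example_ratings, top_k=2)
```

In this toy input agent B is consistently rated lowest by its peers, so it is the one dropped when only two agents are retained.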
This creates a dynamic interaction architecture: agents are the nodes of a network, and the messages they exchange across time steps are its edges. An LLM-powered ranker ranks agents at inference time and deactivates low-performing ones for subsequent rounds, while an early-stopping mechanism prevents unnecessary iterations.
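How deactivation and early stopping interact over rounds can be sketched as a simple loop. The note does not specify the stopping rule, so the supermajority threshold below is an assumption, as are the hypothetical callables `respond` (produces an agent's answer) and `rank` (stands in for the LLM ranker and returns agents best-first).

```python
from collections import Counter


def consensus_reached(answers, threshold=2 / 3):
    """True when more than `threshold` of the active agents give the same answer."""
    if not answers:
        return False
    _answer, count = Counter(answers.values()).most_common(1)[0]
    return count / len(answers) > threshold


def run_rounds(agents, respond, rank, max_rounds=4, keep=2, threshold=2 / 3):
    """Run rounds, pruning to `keep` agents after each round without consensus.

    respond(agent, round_idx, previous_answers) -> answer
    rank(active_agents, answers) -> active agents ordered best-first
    Returns (final_answers, active_agents, rounds_used).
    """
    active = list(agents)
    answers = {}
    for round_idx in range(max_rounds):
        # Each active agent answers, seeing the previous round's answers.
        answers = {a: respond(a, round_idx, answers) for a in active}
        # Early stopping: skip the remaining rounds once agents agree.
        if consensus_reached(answers, threshold):
            return answers, active, round_idx + 1
        # Deactivation: only the top-ranked agents continue.
        active = rank(active, answers)[:keep]
    return answers, active, max_rounds
```

The design choice to check consensus before ranking means a round that already agrees costs no ranker call; a real system might order these differently.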
The insight connects to multiple threads in multi-agent reasoning:
For the problem raised in Why do multi-agent LLM systems converge without real debate?, DyLAN's contribution scoring provides a partial solution: agents that merely agree without adding information would receive low importance scores and be deactivated. This could reduce the noise-amplification problem documented in When does debate actually improve reasoning accuracy?.
The approach contrasts with Can extreme task decomposition enable reliable execution at million-step scale? (MAKER), which uses static decomposition with voting. DyLAN dynamically prunes the agent network during execution — a more adaptive but less parallelizable strategy. The trade-off maps onto How should we balance parallel versus sequential compute at test time?: static decomposition enables parallelism while dynamic selection enables adaptation.
The Agent Importance Score also provides a concrete implementation of the "contribution-based routing" that Can AI systems detect when they've genuinely reached agreement? advocates — but generalized beyond agreement detection to overall contribution quantification.
AgentVerse four-stage dynamic group adjustment (from Arxiv/Agents Multi): AgentVerse extends the dynamic team composition principle with a four-stage group problem-solving process that mirrors human group dynamics:
- Expert Recruitment: dynamically adjusting team composition based on current problem-solving progress
- Collaborative Decision-Making: recruited agents discuss and formulate strategies until consensus
- Action Execution: agents interact with the environment to execute agreed actions
- Evaluation: comparing current state to desired goal, with the feedback reward looping back to Expert Recruitment for team re-composition
Unlike DyLAN's contribution scoring, which prunes within a fixed network, AgentVerse's recruitment stage can introduce new agent profiles not in the original team. The evaluation-to-recruitment feedback loop enables adaptive team evolution over the course of problem-solving: the team that finishes may differ substantially from the team that started.
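The four stages and their feedback loop can be sketched as a control loop. This is only an illustration of the structure described above, not AgentVerse's API: the function `group_problem_solving` and the four callables passed into it are hypothetical stand-ins for LLM-backed components.

```python
from typing import Callable, List, Tuple


def group_problem_solving(
    goal: str,
    recruit: Callable[[str, str], List[str]],
    decide: Callable[[List[str], str], str],
    execute: Callable[[List[str], str], str],
    evaluate: Callable[[str, str], Tuple[bool, str]],
    max_iterations: int = 3,
) -> Tuple[str, List[List[str]]]:
    """Loop over the four stages; returns (final_state, team used per iteration)."""
    feedback = ""  # no feedback before the first evaluation
    state = ""
    team_history: List[List[str]] = []
    for _ in range(max_iterations):
        team = recruit(goal, feedback)          # 1. Expert Recruitment
        plan = decide(team, goal)               # 2. Collaborative Decision-Making
        state = execute(team, plan)             # 3. Action Execution
        done, feedback = evaluate(state, goal)  # 4. Evaluation
        team_history.append(team)
        if done:
            break
        # Otherwise the feedback flows into the next recruitment step,
        # so the team may change between iterations.
    return state, team_history
```

Because `recruit` sees the latest feedback, it can return agent profiles that were not in any earlier team, which is the difference from pruning a fixed network.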
MasRouter's four-decision MASR framework (from Arxiv/Routers): MasRouter formalizes multi-agent system routing as four simultaneous decisions: collaboration topology, agent count, role allocation, and per-agent LLM selection. This reveals that DyLAN's contribution-based agent selection addresses only runtime optimization within an already-constructed network. MasRouter constructs the network itself, choosing topology, roles, and LLM assignments from scratch via a cascaded variational-probabilistic-multinomial controller. The two approaches are complementary: MasRouter for initial construction (design-time routing), DyLAN for runtime adaptation (inference-time pruning). Composing them would, in principle, yield a system that starts from a well-chosen network configuration and then adapts it during execution. See What decisions must multi-agent routing systems optimize simultaneously?.
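The division of labor between design-time routing and runtime pruning can be made concrete with a small data structure. This is a sketch under stated assumptions, not MasRouter's or DyLAN's interface: `RoutingDecision`, `prune`, the role names, and the model names are all hypothetical, and roles are assumed to be unique within a team.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class RoutingDecision:
    """The four design-time decisions, held together in one record."""
    topology: str                 # collaboration topology, e.g. "debate"
    agent_count: int              # number of agents
    roles: List[str]              # role allocation (assumed unique)
    llm_per_role: Dict[str, str]  # per-agent LLM selection

    def __post_init__(self):
        # The four decisions are interdependent, so check they agree.
        if self.agent_count != len(self.roles):
            raise ValueError("agent_count must match the number of roles")
        if set(self.roles) != set(self.llm_per_role):
            raise ValueError("every role needs exactly one LLM assignment")


def prune(decision: RoutingDecision, scores: Dict[str, float], keep: int) -> RoutingDecision:
    """Runtime step: keep the `keep` highest-scoring roles, preserving their order."""
    ranked = sorted(decision.roles, key=lambda r: scores.get(r, 0.0), reverse=True)
    kept = set(ranked[:keep])
    roles = [r for r in decision.roles if r in kept]
    return RoutingDecision(
        topology=decision.topology,
        agent_count=len(roles),
        roles=roles,
        llm_per_role={r: decision.llm_per_role[r] for r in roles},
    )


# Design time: a router picks the initial configuration (values are made up).
initial = RoutingDecision(
    topology="debate",
    agent_count=3,
    roles=["solver", "critic", "summarizer"],
    llm_per_role={"solver": "model-large", "critic": "model-small",
                  "summarizer": "model-small"},
)
# Runtime: contribution scores (made up) drive the pruning.
adapted = prune(initial, {"solver": 0.9, "critic": 0.2, "summarizer": 0.6}, keep=2)
```

Note that pruning here only changes agent count, roles, and LLM assignments; the topology chosen at design time is carried over unchanged.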
Source: Agents
Related concepts in this collection
- Why do multi-agent LLM systems converge without real debate?
  When multiple AI agents reason together, do they genuinely deliberate or just accommodate each other's views? Research into clinical reasoning systems reveals how often agents reach agreement without substantive disagreement.
  the problem DyLAN partially addresses: uninformative agents get deactivated
- Can extreme task decomposition enable reliable execution at million-step scale?
  Can breaking tasks into maximally atomic subtasks with voting-based error correction solve the fundamental reliability problem in long-horizon tasks? This challenges whether better models or better decomposition is the path to high-reliability AI systems.
  contrasting approach: static decomposition vs dynamic pruning
- Can AI systems detect when they've genuinely reached agreement?
  When multiple AI agents debate, they often converge without actually deliberating. Can a dedicated agent reliably identify true agreement versus false consensus, and would that improve debate outcomes?
  agreement detection as a special case of contribution scoring
- When does debate actually improve reasoning accuracy?
  Multi-agent debate shows promise for reasoning tasks, but under what conditions does it help versus hurt? The research explores whether debate amplifies errors when evidence verification is missing.
  deactivating low-quality agents could reduce error amplification
- What decisions must multi-agent routing systems optimize simultaneously?
  Standard LLM routing only picks which model to use. But multi-agent systems involve four interdependent choices: topology, agent count, role assignment, and per-agent model selection. Does optimizing all four together actually improve performance?
  MasRouter: design-time construction of the network DyLAN then prunes at runtime
Original note title
dynamic inference-time agent selection via contribution scoring deactivates low-performing agents and optimizes team composition