What decisions must multi-agent routing systems optimize simultaneously?

Standard LLM routing only picks which model to use. But multi-agent systems involve four interdependent choices: topology, agent count, role assignment, and per-agent model selection. Does optimizing all four together actually improve performance?

Note · 2026-02-23 · sourced from Routers

Standard LLM routing (RouteLLM, Hybrid-LLM) optimizes a single decision: which model handles this query. MasRouter argues this is an incomplete optimization for multi-agent systems, where routing involves four simultaneous decisions:

Collaboration mode determination — choosing the optimal communication topology (Chain, Tree, Graph) for varying task complexities
Dynamic agent number — determining how many expert agents are required based on input difficulty
Agent role allocation — selecting suitable roles per agent according to the query domain
Agent LLM routing — assigning each agent the appropriate LLM backbone

The formal definition of Multi-Agent System Routing (MASR) integrates all four into a unified framework. MasRouter implements this through a cascaded controller network: a variational latent variable model routes the query to a collaboration module, a structured probabilistic cascade generates agent roles progressively, and a multinomial distribution model recommends LLM backbones per agent. The cascade is sequential by design — topology constrains which roles make sense, and roles constrain which LLMs are appropriate.
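The sequential dependency can be sketched as a toy cascade. This is a minimal illustration and not MasRouter's implementation: the candidate pools, the `difficulty` score, the sampling heuristics, and every name below (`route`, `RoutingPlan`, `AgentSpec`, the role and model labels) are assumptions made for the example. The learned controllers are replaced with simple random choices so that only the conditioning structure is visible.

```python
import random
from dataclasses import dataclass

# Hypothetical candidate pools; all labels are illustrative placeholders.
TOPOLOGIES = ["chain", "tree", "graph"]
ROLES_BY_TOPOLOGY = {
    "chain": ["planner", "coder", "reviewer"],
    "tree": ["planner", "coder", "tester", "reviewer"],
    "graph": ["planner", "coder", "tester", "critic", "reviewer"],
}
LLMS_BY_ROLE = {
    "planner": ["large-llm"],
    "coder": ["large-llm", "small-llm"],
    "tester": ["small-llm"],
    "critic": ["large-llm", "small-llm"],
    "reviewer": ["small-llm"],
}


@dataclass(frozen=True)
class AgentSpec:
    role: str
    llm: str


@dataclass(frozen=True)
class RoutingPlan:
    topology: str
    agents: tuple[AgentSpec, ...]


def route(difficulty: float, rng: random.Random) -> RoutingPlan:
    """Toy cascade: topology -> agent count -> roles -> per-agent LLM."""
    if not 0.0 <= difficulty <= 1.0:
        raise ValueError("difficulty must be in [0, 1]")

    # Stage 1: sample a topology; harder queries favour richer topologies.
    weights = [1.0 - difficulty, 0.5, difficulty]
    topology = rng.choices(TOPOLOGIES, weights=weights, k=1)[0]

    # Stage 2: the agent count grows with difficulty, bounded by the
    # role pool that the chosen topology allows.
    pool = ROLES_BY_TOPOLOGY[topology]
    n_agents = 1 + round(difficulty * (len(pool) - 1))

    # Stage 3: pick roles one at a time; each pick depends on earlier
    # picks (here simply: no role is chosen twice).
    remaining = list(pool)
    roles = []
    for _ in range(n_agents):
        role = rng.choice(remaining)
        remaining.remove(role)
        roles.append(role)

    # Stage 4: choose an LLM per agent, restricted by that agent's role.
    agents = tuple(AgentSpec(r, rng.choice(LLMS_BY_ROLE[r])) for r in roles)
    return RoutingPlan(topology, agents)


plan = route(0.8, random.Random(0))
print(plan.topology, [(a.role, a.llm) for a in plan.agents])
```

Each stage only sees options that the previous stage left open, which is what makes the intermediate decisions inspectable: a plan can be read back as a topology, then a role list, then a model per role.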

The reported results support the multi-dimensional approach: MasRouter surpasses RouterDC (SOTA single-model routing) by 3.51% in average accuracy while reducing HumanEval cost from $0.363 to $0.185 (a 49% reduction). The framework generalizes to unseen LLM backbones and collaboration modes, and when integrated with mainstream MAS frameworks it reduces their cost by 17-28%.
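The 49% figure follows directly from the two HumanEval costs quoted above. A quick check, where the helper name `relative_reduction` is made up for this example:

```python
def relative_reduction(before: float, after: float) -> float:
    """Fraction of the original cost that was saved."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before


# HumanEval costs quoted in the note, in dollars.
reduction = relative_reduction(0.363, 0.185)
print(f"{reduction:.1%}")  # prints 49.0%
```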

MasRouter is a more structured alternative to the approach in "Can AI systems design unique multi-agent workflows per individual query?": FlowReasoner generates system designs via RL-trained code generation (maximum flexibility, less interpretability), while MasRouter's topology→role→LLM cascade provides interpretable intermediate decisions at the cost of fixed structure types. Relative to "Can multi-agent teams automatically remove their weakest members?", DyLAN prunes within a running network while MasRouter constructs the network from scratch. The two approaches are complementary and could be composed (MasRouter for initial construction, DyLAN for runtime adaptation).

The formalization matters because it surfaces what single-model routing leaves on the table. As "When does adding more agents actually help systems?" discusses, error amplification depends on topology. Routing to the right topology per query is MasRouter's direct response: rather than accepting a fixed topology's scaling limitations, it routes around them.

Related concepts in this collection

Can AI systems design unique multi-agent workflows per individual query? Explores whether meta-agents trained with reinforcement learning can automatically generate personalized multi-agent system architectures tailored to individual user queries, rather than applying fixed task-level templates uniformly.
FlowReasoner: more flexible per-query design via RL code generation; MasRouter: more structured cascade
Can multi-agent teams automatically remove their weakest members? Explores whether agents can score each other's contributions during problem-solving and use those scores to deactivate underperforming teammates in real time, improving overall team efficiency.
runtime pruning complements construction-time routing
When does adding more agents actually help systems? Multi-agent systems often fail in practice, but the reasons remain unclear. This research investigates whether coordination overhead, task properties, or system architecture determine when agents improve or degrade performance.
MasRouter routes around topology limitations per query
Can routers select the right model before generation happens? Explores whether LLMs can be matched to queries by estimating difficulty upfront, before any generation begins. This matters because routing could cut costs significantly while preserving response quality.
single-model routing as the simplest case of MASR

Original note title

multi-agent system routing requires four simultaneous decisions — collaboration topology agent count role allocation and per-agent LLM selection