Why do homogeneous multi-agent systems fail similarly to self-revision?

This explores why putting many copies of the same model in a room together produces the same wrong answers as a single model checking its own work — the suspicion being that both share a root cause rather than being separate problems.

This explores why homogeneous multi-agent systems (many instances of the same model deliberating) fail in the same ways as a single model revising its own output — and the corpus suggests the answer is that they're not really two failure modes, but one mechanism wearing two costumes. The sharpest statement of this is the finding that multi-agent reasoning systems reach premature consensus 61% of the time without any genuine disagreement, while single-model self-revision amplifies confidence in wrong answers — and both trace back to the same training pressure toward agreement and accommodation rather than challenge Why do AI systems agree when they should disagree?. If every agent was shaped by the same gradient to be agreeable, adding more of them doesn't add dissent; it adds echoes.

The reason self-revision can't escape this on its own is structural: pure self-improvement is circular. A model can't reliably verify what it can't generate, so self-checking stalls on the generation-verification gap, diversity collapse, and reward hacking — the reliable improvement methods all quietly smuggle in an *external* anchor (a past checkpoint, a third-party judge, a user correction, a tool result) Can models reliably improve themselves without external feedback?. A homogeneous multi-agent system is, in effect, self-revision distributed across copies: every 'critic' draws from the same well as the 'author,' so there's no genuinely external signal in the room. More agents is more self, not more verification.

This is why the failure modes look identical at both scales. One survey frames it directly — group-level failures like silent agreement, degeneration of thought, and social accommodation *mirror* individual reasoning failures, and real-world autonomous completion plateaus near 30% regardless of how many agents you add; capability gains require deliberation *diversity* and expertise, not headcount Why do multi-agent systems fail despite individual capability?. The mechanism of propagation is visible too: agents accept neighbor information without verification, so a single error spreads uncritically through the network even though each agent remains capable of catching a direct conflict Why do multi-agent systems fail to coordinate at scale?. Uncritical acceptance between agents is the same move as uncritical acceptance of your own prior token.

The lateral lesson — the thing you might not have known you wanted — is that the cure for both is the same cure: inject genuine difference or genuine externality. The systems that beat both centralized and fully-autonomous designs do it by forcing structural diversity (fixed external ordering with autonomous, self-selected roles, where agents spontaneously specialize and abstain when incompetent) rather than by trusting homogeneous consensus Do self-organizing agent teams outperform rigid hierarchies?. And reliability itself turns out to come from externalizing cognitive burdens — memory, skills, protocols — into a harness outside the model, not from the model interrogating itself harder Where does agent reliability actually come from?. Both self-revision and homogeneous agent swarms fail because they keep the loop inside the same head; what fixes them is an honest outside voice.

Sources 6 notes

Why do AI systems agree when they should disagree?

Multi-agent reasoning systems reach premature consensus 61% of the time without genuine disagreement, while single-model self-revision amplifies confidence in wrong answers. Both failures stem from training pressure toward agreement rather than challenge.

Can models reliably improve themselves without external feedback?

Pure self-improvement stalls due to the generation-verification gap, diversity collapse, and reward hacking. Reliable improvement methods succeed by smuggling in external anchors: past model versions, third-party judges, user corrections, or tool feedback.

Why do multi-agent systems fail despite individual capability?

Multi-agent systems exhibit specific failure modes—silent agreement, degeneration of thought, and social accommodation—that mirror individual reasoning failures at group scale. Real-world autonomous task completion plateaus near 30% regardless of agent count; capability gains require deliberation diversity, expertise prerequisites, and formal coordination architectures.

Why do multi-agent systems fail to coordinate at scale?

AgentsNet benchmark shows agents fail to coordinate strategies either by agreeing too late or adopting strategies without informing neighbors. Agents accept neighbor information without verification, enabling error propagation while remaining capable of detecting direct conflicts.

Do self-organizing agent teams outperform rigid hierarchies?

A 25,000-task experiment across 8 models and multiple agent counts showed that sequential protocols with external ordering but internal role selection outperform centralized systems by 14% and fully autonomous systems by 44%. Agents spontaneously invented specialized roles and self-abstained when incompetent.

Where does agent reliability actually come from?

Research shows reliable LLM agents externalize three cognitive burdens—memory (state persistence), skills (procedural components), and protocols (structured interaction)—into a harness layer rather than relying on model scale alone. The harness unifies these externalities and eliminates the need for the model to solve the same problems repeatedly.

Why do homogeneous multi-agent systems fail similarly to self-revision?

Sources 6 notes

Next inquiring lines