Can LLM agent groups reliably reach consensus together?
Tests whether multi-agent LLM systems can achieve valid agreement in Byzantine consensus games, even under benign conditions with no conflicting preferences over outcomes.
Multi-agent LLM systems are increasingly deployed in contexts that require consensus: agreeing on a delegated task, validating a shared decision, converging on a planned action sequence. The question is whether they can actually reach agreement reliably when challenged.
"Can AI Agents Agree?" (2603.01213) tests this with a Byzantine consensus game over scalar values, run as a synchronous all-to-all simulation. The setup deliberately strips out value-optimization concerns: in a no-stake setting, agents have no preferences over the final value, so the evaluation focuses purely on whether agreement is reachable rather than on what gets agreed to. That makes it about the simplest possible test of consensus capability.
The finding is uncomfortable for current MAS deployments: valid agreement is not reliable even in benign settings without Byzantine agents, and degrades monotonically as group size grows. Introducing even a small number of Byzantine agents further reduces success. Across hundreds of simulations spanning model sizes, group sizes, and Byzantine fractions, the LLM-agent groups frequently fail to reach valid consensus within the round limit.
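To make "valid consensus within the round limit" concrete, here is a minimal sketch of a synchronous all-to-all game over scalar values. It is not the paper's harness. The stand-in agent policy (move to the median of what was heard, plus Gaussian noise), the tolerance `tol`, the round limit `max_rounds`, and the validity rule (the decided value must lie within the range of initial values) are all assumptions chosen for illustration.

```python
import random
import statistics


def run_consensus(initial, max_rounds=20, tol=1e-6, noise=0.0, seed=0):
    """Run synchronous all-to-all rounds over scalar proposals.

    Hypothetical stand-in for an LLM-agent game: each "agent" moves to the
    median of the proposals it heard, perturbed by Gaussian noise.
    """
    rng = random.Random(seed)
    values = list(initial)
    lo, hi = min(initial), max(initial)  # assumed validity range

    for round_no in range(1, max_rounds + 1):
        # Synchronous all-to-all: every agent sees the same previous-round snapshot.
        snapshot = list(values)
        target = statistics.median(snapshot)
        values = [target + rng.gauss(0.0, noise) for _ in snapshot]

        # Agreement check: all proposals within tolerance of each other.
        if max(values) - min(values) <= tol:
            decided = values[0]
            return {
                "agreed": True,
                "valid": lo <= decided <= hi,
                "rounds": round_no,
                "value": decided,
            }

    # Round limit reached without agreement: a timeout, not a wrong answer.
    return {"agreed": False, "valid": False, "rounds": max_rounds, "value": None}


deterministic = run_consensus([1.0, 5.0, 9.0])
noisy = run_consensus([1.0, 5.0, 9.0], max_rounds=10, noise=1.0)
```

With `noise=0.0` the stand-in agents agree after a single round. With any appreciable noise they keep missing each other until the round limit is reached, which is the stalled-convergence shape of failure rather than agreement on a corrupted value.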
The mechanism is the key insight. Failures are dominated by liveness loss — timeouts and stalled convergence — rather than by subtle value corruption. The agents don't get tricked into the wrong answer; they get stuck not converging on any answer at all. This contrasts with the standard intuition that Byzantine fault tolerance is primarily about defending against adversarial value injection. For LLM agents, the harder problem is reaching agreement at all, before even worrying about whether the agreement is the right one.
The structural diagnosis: current LLM agents lack the protocol discipline that distributed systems achieve through deterministic state machines. Each agent generates stochastic responses, can drift off-topic, can fail to recognize when consensus has been reached, and can introduce procedural confusion that uses up the round limit without producing a decision. Liveness (the property that the system eventually decides something) is harder to secure than safety (the property that what it decides is correct) when the agents themselves are stochastic.
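The liveness/safety distinction can be expressed as a small outcome classifier. This is a sketch under stated assumptions, not the paper's scoring code: the function name `classify_run`, the label strings, and the validity rule (the decided value must fall within the range of honest inputs) are hypothetical choices made for illustration.

```python
from typing import List, Optional


def classify_run(
    decisions: List[Optional[float]],
    honest_inputs: List[float],
    tol: float = 1e-6,
) -> str:
    """Label one finished run. `decisions` holds each honest agent's final
    decision, or None if that agent had not decided by the round limit.
    Assumes both lists are non-empty.
    """
    # Liveness: every honest agent must have decided something.
    if any(d is None for d in decisions):
        return "liveness_failure"

    decided = [d for d in decisions if d is not None]

    # Safety, part 1 (agreement): all honest decisions coincide within tolerance.
    if max(decided) - min(decided) > tol:
        return "safety_failure_disagreement"

    # Safety, part 2 (validity, assumed rule): the value lies in the honest input range.
    if not (min(honest_inputs) <= decided[0] <= max(honest_inputs)):
        return "safety_failure_invalid_value"

    return "valid_agreement"


labels = [
    classify_run([None, 4.0, 4.0], [1.0, 4.0, 9.0]),
    classify_run([4.0, 4.0, 7.0], [1.0, 4.0, 9.0]),
    classify_run([20.0, 20.0, 20.0], [1.0, 4.0, 9.0]),
    classify_run([4.0, 4.0, 4.0], [1.0, 4.0, 9.0]),
]
```

Tallying labels like these across many runs is one way to see whether failures cluster in the liveness bucket, as the note reports, or in the two safety buckets.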
This connects to "Why do multi-agent LLM systems converge without genuine deliberation?" from a different angle. Silent agreement is premature convergence on a wrong answer; this paper documents the opposite failure mode, failure to converge at all. Together the two bracket the consensus failure space: when multi-agent systems try to reach agreement, they either (a) agree prematurely and silently, without genuine deliberation, or (b) fail to converge through liveness loss. Neither is reliable.
The implication for deployment is stark: agreement is not yet a dependable emergent capability of current LLM-agent groups. Systems that rely on multi-agent consensus for cooperation, delegation, or safety-critical coordination are building on a fragile foundation. The dominant question for production MAS becomes architectural — how to introduce protocol structure that does NOT rely on agents themselves recognizing convergence — rather than purely behavioral or training-based.
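One way to read "protocol structure that does not rely on agents recognizing convergence" is an external, deterministic coordinator that owns the termination decision. The sketch below is an illustration of that idea only. It is not proposed in the paper, the `Agent` callable interface is hypothetical, and the median fallback at the round limit is an assumed design choice that trades some agreement quality for guaranteed termination.

```python
import statistics
from typing import Callable, List, Tuple

# Hypothetical agent interface: given last round's proposals, return a new scalar.
Agent = Callable[[List[float]], float]


def coordinate(
    agents: List[Agent],
    initial: List[float],
    max_rounds: int = 10,
    tol: float = 1e-3,
) -> Tuple[float, str, int]:
    """Run rounds and let the coordinator, not the agents, decide when to stop."""
    proposals = list(initial)
    for round_no in range(1, max_rounds + 1):
        snapshot = list(proposals)  # same view for every agent (synchronous round)
        proposals = [agent(snapshot) for agent in agents]

        # Deterministic convergence test, evaluated outside the agents.
        if max(proposals) - min(proposals) <= tol:
            return statistics.median(proposals), "converged", round_no

    # Assumed fallback: decide the median of the last proposals so the run always terminates.
    return statistics.median(proposals), "fallback", max_rounds


# Cooperative stand-ins converge; stubborn stand-ins trigger the fallback.
cooperative = [lambda heard: statistics.median(heard) for _ in range(3)]
stubborn = [lambda heard, v=v: v for v in (1.0, 5.0, 9.0)]

ok = coordinate(cooperative, [1.0, 5.0, 9.0])
stuck = coordinate(stubborn, [1.0, 5.0, 9.0], max_rounds=4)
```

The stochastic agents only supply proposals here. Whether a decision exists is settled by ordinary code, so a run cannot stall because an agent failed to notice that agreement had already been reached.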
Paper: Can AI Agents Agree?
Related concepts in this collection
- Why do multi-agent LLM systems converge without genuine deliberation?
  Multi-agent reasoning systems are designed to improve answers through debate, but often agents simply agree with early confident claims rather than genuinely disagreeing. What drives this pattern and how common is it?
  the opposite failure mode in the consensus space: this paper documents failure-to-converge; silent-agreement documents premature-convergence; together they bracket the unreliable-consensus problem
- Why do multi-agent systems fail to coordinate at scale?
  Explores how LLM agents struggle to synchronize strategy timing and validate information when coordinating across larger networks, revealing fundamental limits in distributed reasoning.
  AgentsNet shows scale-dependent coordination failure on COLORING; this paper shows scale-dependent consensus failure on scalar values; same scaling pattern in different task class
- Why do multi-agent LLM systems fail more than expected?
  This research asks what specific failure modes cause multi-agent systems to underperform despite their promise. Understanding these failure patterns is essential for building more reliable collaborative AI systems.
  MAST taxonomy includes coordination failures; this paper isolates one specific mode (Byzantine liveness loss) for systematic measurement
- Why do autonomous LLM agents fail in predictable ways?
  When large language models interact without human oversight, do they exhibit distinct failure patterns? Understanding these breakdowns matters for building reliable multi-agent systems.
  infinite loops in CAMEL are the same dynamic as the liveness loss documented here: stochastic agents fail to recognize when to stop
Original note title
LLM-agent Byzantine consensus fails primarily through liveness loss not value corruption — agreement is fragile even in benign no-stake settings and degrades with group size