Why do reasoning systems keep discovering new connections?
Explores whether agentic graph reasoning systems maintain a special balance between semantic diversity and structural organization that enables continuous discovery of novel conceptual relationships.
Analysis of iterative agentic graph reasoning models (Graph-PRefLexOR) reveals that as these systems autonomously expand knowledge graphs over hundreds of iterations, they evolve toward a self-organized critical state analogous to thermodynamic phase transitions. The key finding: semantic entropy (the diversity of meanings in the embedding space) persistently dominates structural entropy (the organization of graph connections), creating a stable "mildly negative" discovery parameter reminiscent of a free-energy minimum shifted toward disorder.
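The two entropies can be made concrete with a toy calculation. This is a minimal sketch, not the paper's exact formulation: it assumes semantic entropy is measured over the distribution of embedding-cluster labels across nodes, structural entropy over the graph's degree distribution, and the discovery parameter as their signed difference. The example graph and cluster labels are invented for illustration.

```python
import math
from collections import Counter

def shannon_entropy(counts):
    """Shannon entropy (nats) of a discrete distribution given by raw counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

# Toy knowledge graph: each node carries a semantic cluster label
# (e.g. obtained by clustering node embeddings)
node_cluster = {"a": 0, "b": 1, "c": 2, "d": 3, "e": 1, "f": 4}
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "e"), ("c", "f")]

# Semantic entropy: diversity of cluster labels across nodes
semantic_H = shannon_entropy(list(Counter(node_cluster.values()).values()))

# Structural entropy: diversity of the degree distribution
degree = Counter()
for u, v in edges:
    degree[u] += 1
    degree[v] += 1
structural_H = shannon_entropy(list(Counter(degree.values()).values()))

# Discovery parameter: structural minus semantic entropy.
# It stays mildly negative while semantic entropy dominates.
discovery = structural_H - semantic_H
```

Here the diverse cluster labels keep semantic entropy above structural entropy, so `discovery` comes out negative, mirroring the "mildly negative" regime described above.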
The structural-semantic dynamics decompose into three regimes:
- Early phase: Strong positive correlation between node centrality and semantic diversity — central nodes rapidly integrate semantically distinct clusters
- Critical transition (~iteration 400): Phase-transition-like behavior where structural-semantic correlation stabilizes
- Post-critical: Mild stable positive correlation (~0.15) — structurally central nodes serve as persistent semantic bridges
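The structural-semantic correlation in these regimes can be sketched as a Pearson correlation between a node's degree centrality and its semantic diversity. The specific choices here are assumptions for illustration: degree as the centrality measure, mean cosine distance to neighbours as semantic diversity, and hypothetical 2-d embeddings.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical embeddings and adjacency for a tiny hub-and-spoke graph
emb = {"a": [1, 0], "b": [0.9, 0.1], "c": [0, 1], "d": [0.5, 0.5], "e": [-1, 0.2]}
adj = {"a": ["b", "c", "d", "e"], "b": ["a"], "c": ["a"], "d": ["a"], "e": ["a"]}

centrality, diversity = [], []
for node, nbrs in adj.items():
    centrality.append(len(nbrs))  # degree centrality
    # semantic diversity: mean cosine *distance* to neighbours
    diversity.append(sum(1 - cosine(emb[node], emb[n]) for n in nbrs) / len(nbrs))

r = pearson(centrality, diversity)
```

Tracking `r` over iterations is what would distinguish the early (strongly positive), critical, and post-critical (~0.15) regimes.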
A persistent ~12% of edges are "surprising" — structurally connected yet semantically distant — representing the system's ongoing capacity for novel conceptual connections. This partial decoupling between structural clusters and semantic similarity demonstrates that the knowledge graphs encode structural and semantic information through fundamentally distinct but complementary dimensions. The step-level decision-making here — which edges to explore — parallels When should retrieval actually help versus hurt reasoning?, where DeepRAG formalizes each reasoning step as a binary retrieve-or-use-parametric-knowledge decision. Both systems demonstrate that adaptive per-step knowledge acquisition outperforms uniform policies.
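The "surprising edge" fraction admits a direct sketch: count edges whose endpoints are connected in the graph but far apart in embedding space. The similarity threshold and embeddings below are illustrative assumptions, not values from the analysis.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

# Hypothetical node embeddings and edges
emb = {"a": [1.0, 0.0], "b": [0.95, 0.05], "c": [-0.2, 1.0], "d": [0.0, -1.0]}
edges = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]

# Below this similarity, a connected pair counts as "surprising"
# (a free parameter here, not taken from the paper)
THRESHOLD = 0.3

surprising = [e for e in edges if cosine(emb[e[0]], emb[e[1]]) < THRESHOLD]
fraction = len(surprising) / len(edges)
```

In the reported dynamics this fraction stabilizes near 12%, indicating a persistent reservoir of structurally present but semantically unexpected connections.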
The insight for AI systems: artificial reasoning systems may remain continuously creative because they constantly explore a rich, diverse semantic space (high semantic entropy) while forming more ordered structural connections (lower structural entropy). The imbalance between available meanings and explicit structure fuels sustained discovery.
This connects to:
- Does policy entropy collapse limit reasoning performance in RL? — the inverse dynamic: where RL training collapses entropy, agentic graph reasoning maintains it; the difference is what's being optimized (output distribution vs. knowledge structure)
- Do reasoning cycles in hidden states reveal aha moments? — both analyze graph topology as predictor of reasoning quality; cyclicity as "aha moments" parallels surprising edges as continuous discovery
- Can diversity optimization improve quality during language model training? — DARLING's semantic diversity optimization may work precisely because it maintains the semantic entropy dominance that enables discovery
- When should retrieval actually help versus hurt reasoning? — both formalize reasoning over external knowledge as per-step optimization: agentic graph reasoning decides which graph edges to explore, DeepRAG decides whether to retrieve; adaptive step-level decisions outperform uniform policies in both cases
- Can reasoning systems maintain memory across multiple retrieval cycles? — both describe iterative reasoning that self-organizes toward comprehension: agentic graph reasoning maintains semantic entropy dominance for continuous discovery, while ComoRAG's PFC-inspired metacognitive loop evolves comprehension states through contradiction detection and resolution in a dynamic memory workspace
Original note title
Agentic graph reasoning self-organizes into a critical state where semantic entropy dominance over structural entropy fuels continuous discovery