How can agent systems share learned skills across users?
Individual users operating autonomous agents independently rediscover solutions because systems lack mechanisms to propagate discoveries. Can centralized aggregation and automatic evolution convert isolated experiences into shared capabilities?
SkillClaw (arXiv:2604.08377) addresses a structural inefficiency in deployed agent ecosystems: users operating in overlapping task spaces independently rediscover the same solutions because the system has no mechanism to convert heterogeneous experiences into shared skill updates. Memory-based methods store trajectories but keep them instance-specific. Skill libraries compress experience into structured instructions but treat the result as static. Neither enables knowledge to accumulate across users.
The architecture has three layers. First, agents deployed across different users generate interaction trajectories during normal use — both successful and failed executions. Second, these trajectories are aggregated centrally and processed by an autonomous evolver — an agent that performs open-ended reasoning over interaction evidence and directly edits skill definitions. The evolver identifies recurring issues and effective procedures, refines existing skills, creates new ones, or adjusts descriptions. Third, updated skills are synchronized across all agents, so improvements discovered in one context propagate system-wide.
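The three-layer loop above can be sketched as a minimal, self-contained Python program. All names here (`SkillHub`, `stub_evolver`, the skill and trajectory fields) are illustrative assumptions rather than the paper's API, and the rule-based stub merely stands in for the LLM-based agentic evolver:

```python
from dataclasses import dataclass

@dataclass
class Trajectory:
    """One interaction record from a deployed agent (layer 1)."""
    user_id: str
    skill_name: str
    steps: list
    success: bool

@dataclass
class Skill:
    """A shared skill definition the evolver is allowed to edit."""
    name: str
    description: str
    version: int = 1

class SkillHub:
    """Central store: aggregates trajectories, runs the evolver, syncs skills."""
    def __init__(self, skills):
        self.skills = {s.name: s for s in skills}
        self.trajectories = []

    def collect(self, traj):
        # Layer 1: trajectories arrive from agents during normal use.
        self.trajectories.append(traj)

    def evolve(self, evolver):
        # Layer 2: the evolver reasons over evidence and edits skill definitions.
        for name, skill in list(self.skills.items()):
            evidence = [t for t in self.trajectories if t.skill_name == name]
            revised = evolver(skill, evidence)
            if revised.description != skill.description:
                revised.version = skill.version + 1
                self.skills[name] = revised

    def sync(self, agents):
        # Layer 3: push the updated skill set to every deployed agent.
        for agent in agents:
            agent.skills = dict(self.skills)

def stub_evolver(skill, evidence):
    # Placeholder for the agentic evolver: if any trajectory failed,
    # revise the skill description to flag the failure mode.
    if any(not t.success for t in evidence):
        return Skill(skill.name, skill.description + " (avoid known failure modes)",
                     skill.version)
    return skill

# Demo: one user's failed run updates the shared skill for everyone.
hub = SkillHub([Skill("csv_clean", "Parse and clean CSV files")])
hub.collect(Trajectory("user_1", "csv_clean", ["open", "parse", "crash"], success=False))
hub.evolve(stub_evolver)
```

One deliberate choice in this sketch: skills carry a version number, so agents receiving a sync can detect that a definition changed since their last pull rather than diffing descriptions.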
Three properties distinguish this from prior agent adaptation:
Collective evolution — individual interactions contribute to a shared skill ecosystem rather than remaining session-local. This is the opposite of the Moltbook finding (see "Why don't AI agents develop social structure at scale?"): Moltbook agents interact extensively but never adapt to each other because they lack a mechanism for cross-agent learning. SkillClaw supplies that missing mechanism by design: the centralized evolver converts distributed experience into shared capability.
Fully automatic — skill evolution requires no manual curation or explicit user intervention. Data collection, evolution, and synchronization all occur in the background. From the user's perspective, the agent simply gets better over time.
Agentic evolution — skill updates are produced through open-ended reasoning rather than predefined update rules. The evolver analyzes both successes and failures, enabling flexible and context-aware improvements that rigid rule-based systems cannot achieve.
The key insight is that the bottleneck in agent ecosystems is not individual capability but knowledge propagation. A single user discovering a reliable workflow for data processing helps no one else until that discovery is encoded in a shared skill and distributed. SkillClaw makes this propagation automatic, which inverts the typical framing: the system improves not because any individual agent gets smarter, but because knowledge flows.
This is a horizontal complement to vertical self-improvement (Can an AI system improve its own search methods automatically?). Bilevel autoresearch makes one research loop smarter; SkillClaw makes an entire user ecosystem smarter. The two are orthogonal and composable.
Source: "SkillClaw: Let Skills Evolve Collectively with Agentic Evolver" (autonomous agents paper)
Related concepts in this collection
- Why don't AI agents develop social structure at scale? When millions of LLM agents interact continuously on a social platform, do they form collective norms and influence hierarchies like human societies? This tests whether scale and interaction density alone drive socialization. Relation: Moltbook shows interaction without influence; SkillClaw adds the centralized evolver that creates the missing influence pathway.
- Can an AI system improve its own search methods automatically? This explores whether an outer AI loop can read and modify an inner research loop's code to discover better search strategies, without human intervention or a stronger model. Relation: vertical self-improvement vs. horizontal knowledge propagation.
- Can agent deployment itself generate training signals automatically? Can we extract learning signals from the natural next-states that agents encounter during real deployment (user replies, tool outputs, test verdicts) rather than relying on separate annotation pipelines? This reframes how agents improve continuously. Relation: OpenClaw-RL captures next-state signals for weight updates; SkillClaw captures interaction trajectories for skill-level updates, two layers of the same insight.
- Do self-organizing agent teams outperform rigid hierarchies? This research explores whether multi-agent LLM systems perform better when agents can self-select roles within a fixed structure, compared to centralized control or full autonomy. The question challenges assumptions about organizational design at scale. Relation: coordination structure; SkillClaw adds a temporal dimension where coordination improves through accumulated experience.
Original note title: cross-user skill evolution requires centralized aggregation of interaction trajectories; individual session learning remains siloed without an autonomous evolver that propagates discoveries system-wide