SkillClaw: Let Skills Evolve Collectively with Agentic Evolver
Large language model (LLM) agents such as OpenClaw rely on reusable skills to perform complex tasks, yet these skills remain largely static after deployment. As a result, similar workflows, tool usage patterns, and failure modes are repeatedly rediscovered across users, preventing the system from improving with experience. While interactions from different users provide complementary signals about when a skill works or fails, existing systems lack a mechanism to convert such heterogeneous experiences into reliable skill updates. To address these issues, we present SkillClaw, a framework for collective skill evolution in multi-user agent ecosystems, which treats cross-user and over-time interactions as the primary signal for improving skills. SkillClaw continuously aggregates trajectories generated during use and processes them with an autonomous evolver, which identifies recurring behavioral patterns and translates them into updates to the skill set by refining existing skills or extending them with new capabilities. The resulting skills are maintained in a shared repository and synchronized across users, allowing improvements discovered in one context to propagate system-wide while requiring no additional effort from users. By integrating multi-user experience into ongoing skill updates, SkillClaw enables cross-user knowledge transfer and cumulative capability improvement, and experiments on WildClawBench show that, with limited interaction and feedback, it significantly improves the performance of Qwen3-Max in real-world agent scenarios.
This limitation becomes evident in everyday usage. For example, users often ask agents to complete multi-step tasks such as automating data processing workflows, where failures frequently arise from subtle issues such as incorrect argument formats or mismatched tool calls. Through several rounds of trial and error, an agent may eventually arrive at a working solution or even a more stable procedure. However, these improvements remain confined to the current session and are not consolidated into the skill set or carried forward to future interactions. As similar tasks recur across different users and over time, the same patterns of failure and recovery are repeatedly observed, yet the system does not improve its behavior. This is fundamentally problematic because users operate in overlapping task spaces where similar workflows, tools, and failure modes are shared, but the system fails to leverage these recurring experiences. Consequently, each user is forced to rediscover solutions independently, preventing knowledge from accumulating at the system level. Therefore, the key challenge is not only to improve performance within a single session, but also to enable knowledge to accumulate and evolve across users.
Existing approaches to agent adaptation fail to support the accumulation and evolution of skills across users and over time. Memory-based methods store past trajectories for retrieval (Shinn et al., 2023; Zhao et al., 2024; Fang et al., 2025a; Tang et al., 2025; Ouyang et al., 2025a; Chhikara et al., 2025; Liu et al., 2026), but such records remain tied to specific instances and are difficult to generalize into improved behavior. Skill-based methods compress experience into structured instructions (Xia et al., 2026a; Zhang et al., 2025a, 2026b; Wu et al., 2025; Zhang et al., 2026a), yet treat the resulting skill library as a static resource that does not evolve through usage. While local refinement can improve individual agent instances, these improvements remain isolated and do not accumulate across users, leading to fragmented skills rather than collective improvement over time. What is missing is a mechanism that turns ordinary interactions into continuous skill evolution and enables skills to improve collectively across users.
Building on this insight, we propose SkillClaw, a framework for collective skill evolution in multi-user OpenClaw-style agent ecosystems (Fig. 1). SkillClaw adopts a centralized evolution architecture, where agents deployed across different users continuously generate interaction sessions during everyday usage. These trajectories are aggregated across users and over time as evidence of real-world task execution and are processed by a centralized evolution engine to drive skill updates. Given accumulated interaction trajectories, the evolver analyzes both successful and failed executions, identifies recurring issues and effective procedures, and updates the shared skill set by refining existing skills, creating new ones, or adjusting their descriptions. Unlike predefined pipelines, this evolution process is driven by an autonomous agent that performs open-ended reasoning over interaction evidence and directly edits skill definitions. The updated skills are then synchronized across agents, allowing improvements discovered in one context to propagate to future interactions across users and over time. This forms a continuous evolution loop in which interaction data drives skill updates, and updated skills improve subsequent interactions. From the user's perspective, this process requires no additional effort, as data collection, evolution, and synchronization all occur automatically in the background.
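The evolution loop described above can be sketched in a few lines. The following is a minimal, hypothetical Python illustration, not the actual SkillClaw implementation: all names (`Trajectory`, `Skill`, `Evolver`) and the rule-based pattern analysis are our own stand-ins, whereas in SkillClaw the analysis step is carried out by an autonomous LLM-driven evolver rather than a fixed heuristic.

```python
from dataclasses import dataclass
from collections import Counter

# Hypothetical schemas; the paper does not specify the actual data model.
@dataclass
class Trajectory:
    skill_name: str
    success: bool
    note: str  # failure reason or recovered procedure observed in the session

@dataclass
class Skill:
    name: str
    description: str
    version: int = 1

class Evolver:
    """Centralized evolver: aggregates cross-user trajectories and edits
    the shared skill set. The failure-counting heuristic below is a stub
    standing in for open-ended agentic reasoning over the evidence."""

    def __init__(self, skills: dict[str, Skill]):
        self.skills = skills          # shared repository, synced to all users
        self.buffer: list[Trajectory] = []

    def ingest(self, traj: Trajectory) -> None:
        self.buffer.append(traj)

    def evolve(self, min_evidence: int = 2) -> list[str]:
        """Refine skills whose failures recur; return names of updated skills."""
        failures = Counter(t.skill_name for t in self.buffer if not t.success)
        updated = []
        for name, count in failures.items():
            if count >= min_evidence and name in self.skills:
                skill = self.skills[name]
                notes = {t.note for t in self.buffer
                         if t.skill_name == name and not t.success}
                # Fold recurring failure evidence into the skill definition.
                skill.description += " | Known pitfalls: " + "; ".join(sorted(notes))
                skill.version += 1
                updated.append(name)
        self.buffer.clear()  # evidence consumed by this evolution round
        return updated

# Usage: two users hit the same argument-format failure; one evolution
# round folds that shared evidence into the skill, system-wide.
skills = {"csv_pipeline": Skill("csv_pipeline", "Automate CSV processing.")}
evolver = Evolver(skills)
evolver.ingest(Trajectory("csv_pipeline", False, "quote args containing commas"))
evolver.ingest(Trajectory("csv_pipeline", False, "quote args containing commas"))
evolver.ingest(Trajectory("csv_pipeline", True, "ok"))
print(evolver.evolve())                  # → ['csv_pipeline']
print(skills["csv_pipeline"].version)    # → 2
```

The loop closes when the updated `skills` repository is re-synchronized to every agent, so the next user invoking `csv_pipeline` inherits the pitfall discovered by others without any manual curation.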
This design introduces three key properties that distinguish SkillClaw from existing systems. First, SkillClaw enables collective evolution, where knowledge from individual interactions contributes to a shared and continuously improving skill ecosystem. Second, it is fully automatic, with skill evolution driven by runtime interaction without manual curation or explicit user intervention. Third, it adopts an agentic evolution paradigm, where skill updates are produced through open-ended reasoning rather than predefined update rules, enabling flexible and context-aware improvements.