Does creating skills inside the agent loop eliminate mismatches?
Can coupling skill creation directly to the runtime reasoning loop—rather than authoring skills offline—close the gap between when skills are made and when they're used? This matters for whether agents can ground new capabilities in their actual situated context.
Most skill-creation approaches treat skills as isolated, static artifacts authored in a separate pass: generated offline, then handed to an agent that uses them in a different context. MUSE-Autoskill instead couples creation to execution through a built-in skill_create tool invoked from within the runtime loop, so a skill is created on demand inside the same reasoning process that needs it. The paper calls the problem this addresses the creation-usage mismatch.
This matters because skills authored out-of-loop encode the author's assumptions about a task the agent has not yet faced, and the agent that later applies them lacks the situated context that motivated each step. When creation happens inside the loop, the skill is grounded in the exact trajectory, tools, and failure that prompted it — and the framework can immediately validate it through unit tests and runtime feedback rather than trusting a detached author. On SkillsBench, automatically generated in-loop skills reach 87.94% on their tasks and transfer to other agents with minimal accuracy loss.
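A minimal Python sketch of the idea, under stated assumptions: only the tool name skill_create comes from the note. The Skill record, SkillRegistry, run_step, and the propose callback (a stand-in for the model drafting a skill) are hypothetical names for illustration, not the MUSE-Autoskill API. The sketch shows a skill created on demand inside a loop step, carrying the context that prompted it, and registered only if its own unit tests pass.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Skill:
    """Hypothetical skill record; not the paper's schema."""
    name: str
    source: str   # Python source that defines a function called `name`
    context: str  # the situation (task, failure) that prompted creation
    tests: List[Tuple[tuple, object]] = field(default_factory=list)


class SkillRegistry:
    def __init__(self) -> None:
        self.skills: Dict[str, Callable] = {}
        self.records: Dict[str, Skill] = {}

    def skill_create(self, skill: Skill) -> bool:
        """Compile the skill and register it only if every test passes."""
        namespace: dict = {}
        try:
            # exec is for illustration only; a real system would sandbox this.
            exec(skill.source, namespace)
            fn = namespace[skill.name]
            for args, expected in skill.tests:
                if fn(*args) != expected:
                    return False
        except Exception:
            return False
        self.skills[skill.name] = fn
        self.records[skill.name] = skill
        return True


def run_step(registry: SkillRegistry, task: str, args: tuple,
             propose: Callable[[str, tuple], Skill]) -> Optional[object]:
    """One loop iteration: reuse a skill, or create one where it is needed."""
    if task not in registry.skills:
        candidate = propose(task, args)  # drafted from the current situation
        if not registry.skill_create(candidate):
            return None                  # validation failed; nothing registered
    return registry.skills[task](*args)


def propose(task: str, args: tuple) -> Skill:
    # Stand-in for the model's draft; a fixed example skill.
    return Skill(
        name=task,
        source="def celsius_to_f(c):\n    return c * 9 / 5 + 32\n",
        context=f"no existing skill for {task}{args}",
        tests=[((0,), 32), ((100,), 212)],
    )


registry = SkillRegistry()
result = run_step(registry, "celsius_to_f", (37,), propose)
```

Because the candidate is tested before registration, a draft that fails its tests never enters the library, which is the validation-inside-the-loop property described above.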
The counterpoint is that in-loop creation risks proliferation: an agent that mints a skill for every situation accumulates redundant, narrow artifacts. MUSE addresses this with the rest of its lifecycle (memory, management, evaluation, refinement), which organizes and prunes the skill library, so creation alone is not the whole story. The durable insight is architectural: skills should be live infrastructure produced where they are consumed, not disposable outputs of a separate authoring stage. That is what makes them testable, transferable assets rather than one-off generations.
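The pruning side of the lifecycle can be sketched in the same spirit. This is an assumption-laden illustration, not the paper's management algorithm: SkillStats, the signature field used to spot duplicates, and the min_uses and min_rate thresholds are all hypothetical. It keeps the best-performing skill per signature and drops skills that have been tried enough times to judge and still fail too often.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class SkillStats:
    """Hypothetical usage record for one skill."""
    name: str
    signature: str  # assumed normalized description of what the skill does
    uses: int = 0
    successes: int = 0

    @property
    def success_rate(self) -> float:
        return self.successes / self.uses if self.uses else 0.0


def prune(library: Dict[str, SkillStats],
          min_uses: int = 3, min_rate: float = 0.5) -> List[str]:
    """Remove duplicate and underperforming skills; return removed names."""
    # Pick one winner per signature: highest success rate, then most uses.
    best: Dict[str, SkillStats] = {}
    for stats in library.values():
        incumbent = best.get(stats.signature)
        if incumbent is None or (
            (stats.success_rate, stats.uses)
            > (incumbent.success_rate, incumbent.uses)
        ):
            best[stats.signature] = stats

    removed: List[str] = []
    for name, stats in list(library.items()):
        duplicate = best[stats.signature] is not stats
        # Only judge a skill as weak once it has enough recorded uses.
        weak = stats.uses >= min_uses and stats.success_rate < min_rate
        if duplicate or weak:
            del library[name]
            removed.append(name)
    return removed


library = {
    "parse_csv_v1": SkillStats("parse_csv_v1", "parse csv", uses=10, successes=9),
    "parse_csv_v2": SkillStats("parse_csv_v2", "parse csv", uses=4, successes=2),
    "scrape_page": SkillStats("scrape_page", "scrape html", uses=5, successes=1),
    "new_skill": SkillStats("new_skill", "resize image", uses=1, successes=0),
}
removed = prune(library)
```

In this example the redundant parse_csv_v2 and the unreliable scrape_page are removed, while new_skill survives because it has too few uses to judge.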
— "MUSE-Autoskill: Self-Evolving Agents via Skill Creation, Memory, Management, and Evaluation", https://arxiv.org/abs/2605.27366
Related concepts in this collection
- Can agents learn new skills without forgetting old ones?
  Explores whether externalized skill libraries (storing learned behaviors as retrievable code rather than parameter updates) can solve the catastrophic forgetting problem that plagues continual learning systems.
  Voyager builds the library by synthesis; MUSE specifies where in the loop creation happens and how the lifecycle prevents proliferation
- Can skill documents be optimized like neural network weights?
  Can natural-language skill documents be treated as trainable parameters and improved through iterative optimization with validation gating, similar to how model weights are tuned in deep learning?
  complementary axis of self-evolving skills: MUSE fixes *where* skills are created (in-loop), SkillOpt fixes *how* they are refined (bounded text-space optimization)
- Can language models learn skills without human supervision?
  Can a three-role self-play system (Challenger, Reasoner, Judge) bootstrap natural-language skills from raw context alone, without human labels or external reward signals?
  extends the in-loop principle: both manufacture skills from the agent's own situated experience rather than out-of-loop authoring, Ctx2Skill via self-play feedback, MUSE via runtime invocation
Original note title
coupling skill creation to a tool invoked inside the runtime loop eliminates the creation-usage mismatch