Can curator modules trained on one executor transfer to entirely different agent backbones?

This explores whether a separately-trained 'curator' — the module that builds and refines an agent's skill library — keeps working when you swap out the underlying model that actually executes the tasks, rather than being locked to the one it was trained against.

Put differently, the question is whether the part of an agent system that *manages skills* (the curator) can be decoupled from the part that *does the work* (the executor) and still function when bolted onto a different model. The corpus has a direct and encouraging answer: SkillOS trains a curator while keeping the executor frozen, and the trained curator generalizes across different executor backbones and domains (see "Can a separate trained curator improve skill libraries better than frozen agents?"). The plausible reason it transfers is *what* it learns: instead of accumulating generic, verbose additions, the curator learns to evolve the repository toward actionable execution logic and cross-task meta-strategies. Those strategies concern the structure of the task rather than the quirks of one model, which would explain why they survive a backbone swap.
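The decoupling described above can be pictured as an interface boundary. The sketch below is illustrative only and is not SkillOS's implementation: `Trajectory`, `Executor`, `Curator`, `run_stream`, and the two toy executors are hypothetical names, and the curator's update rule is a hand-written stand-in for a learned policy. The point it tries to show is that the curator reads and writes only trajectories and repository entries, so nothing in it depends on which executor produced them.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Protocol


@dataclass
class Trajectory:
    task: str
    steps: list[str]
    success: bool


class Executor(Protocol):
    """Anything that can attempt a task given the current skill repository."""

    def run(self, task: str, skills: dict[str, str]) -> Trajectory: ...


@dataclass
class Curator:
    """Hypothetical curator: sees trajectories only, never executor internals."""

    repo: dict[str, str] = field(default_factory=dict)

    def update(self, traj: Trajectory) -> None:
        # Toy rule standing in for a trained policy (an assumption):
        # store the step sequence of a successful run as a reusable skill.
        if traj.success:
            self.repo[traj.task] = " -> ".join(traj.steps)


class TerseExecutor:
    """Toy backbone A: short default plan, reuses a stored skill if present."""

    def run(self, task: str, skills: dict[str, str]) -> Trajectory:
        hint = skills.get(task)
        steps = hint.split(" -> ") if hint else ["plan", "act"]
        return Trajectory(task, steps, success=True)


class VerboseExecutor:
    """Toy backbone B: longer default plan, same repository format."""

    def run(self, task: str, skills: dict[str, str]) -> Trajectory:
        hint = skills.get(task)
        steps = hint.split(" -> ") if hint else ["think", "plan", "act", "check"]
        return Trajectory(task, steps, success=True)


def run_stream(curator: Curator, executor: Executor, tasks: list[str]) -> list[Trajectory]:
    """Run tasks in order, letting the curator update the repo after each one."""
    results = []
    for task in tasks:
        traj = executor.run(task, curator.repo)
        curator.update(traj)
        results.append(traj)
    return results
```

In this toy setup, a repository built while running `TerseExecutor` can be handed unchanged to `VerboseExecutor`, which then follows the stored steps instead of its own default plan. Whether a real trained curator behaves this cleanly is an empirical question, which is what the SkillOS result speaks to.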

Why this is even possible becomes clearer when you look at what makes skills portable in the first place. VOYAGER stores skills as executable, embedding-indexed entries and composes complex ones from simpler ones (see "Can agents learn new skills without forgetting old ones?"), and Agent Workflow Memory abstracts away example-specific values to induce reusable sub-task routines, gaining more as the gap between training and test widens (see "Can agents learn reusable sub-task routines from past experience?"). The common thread: skills externalized as data (rather than baked into weights) are model-agnostic by construction. A curator that operates on that externalized layer is manipulating a representation that any sufficiently capable executor can read, so transfer is less a lucky property and more a design consequence.
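To make "skills externalized as data" concrete, here is a minimal sketch of an indexed skill library with value abstraction. It is not VOYAGER's or Agent Workflow Memory's code: `embed`, `cosine`, `abstract_values`, `Skill`, and `SkillLibrary` are hypothetical names, and the bag-of-words `embed` is a toy stand-in for a real embedding model. The example skills are invented for illustration.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding" (an assumption); a real system would
    # call an embedding model here.
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def abstract_values(steps: list[str], values: dict[str, str]) -> list[str]:
    """Replace example-specific values with named placeholders."""
    abstracted = []
    for step in steps:
        for name, value in values.items():
            step = step.replace(value, "{" + name + "}")
        abstracted.append(step)
    return abstracted


@dataclass
class Skill:
    description: str
    steps: list[str]


class SkillLibrary:
    """Skills stored as plain data, indexed by an embedding of their description."""

    def __init__(self) -> None:
        self._entries: list[tuple[Counter, Skill]] = []

    def add(self, skill: Skill) -> None:
        self._entries.append((embed(skill.description), skill))

    def retrieve(self, query: str) -> Skill:
        # Return the most similar stored skill; raises ValueError if empty.
        q = embed(query)
        return max(self._entries, key=lambda entry: cosine(q, entry[0]))[1]


library = SkillLibrary()
library.add(
    Skill(
        "search for flights to a city",
        abstract_values(
            ["type 'Paris' into destination", "click search"], {"city": "Paris"}
        ),
    )
)
library.add(Skill("add an item to the shopping cart", ["open item", "click add to cart"]))

routine = library.retrieve("find flights to a destination city")
filled_steps = [step.format(city="Tokyo") for step in routine.steps]
```

Nothing in the stored entries refers to a particular model: the routine is text with placeholders, and any executor able to follow the steps can consume it. That is the sense in which a curator editing this layer is working on a backbone-independent representation.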

There's an even sharper version of this idea: AgentFly treats learning as memory operations over a Memory-augmented MDP and improves the policy *without touching the model's parameters at all* (see "Can agents learn continuously from experience without updating weights?"). If adaptation lives entirely in memory, the executor becomes a swappable component almost by definition. The same logic shows up in the heterogeneous-architecture argument that small models suffice for most agentic subtasks: you'd want curation and routing to survive whatever mix of SLMs and LLMs you happen to deploy underneath (see "Can small language models handle most agent tasks?").
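The idea of improving a policy purely through memory writes can be shown with a toy loop. This is a sketch under stated assumptions, not AgentFly's method: it uses a single case memory (AgentFly is described with case, subtask, and tool memories), and `CANDIDATES`, `frozen_model`, `other_model`, `CaseMemory`, `toy_reward`, and `run_episodes` are all hypothetical. The "models" are fixed functions whose outputs change only because the memory passed to them changes.

```python
from dataclasses import dataclass, field

# Hypothetical plan space for a toy question-answering task.
CANDIDATES = ["guess", "search then answer", "search, verify, then answer"]


def frozen_model(task, cases):
    """Stand-in for a frozen LLM: reuse a past success, else try something new."""
    successes = [plan for plan, reward in cases if reward > 0]
    if successes:
        return successes[0]
    tried = {plan for plan, _ in cases}
    untried = [plan for plan in CANDIDATES if plan not in tried]
    return untried[0] if untried else CANDIDATES[-1]


def other_model(task, cases):
    """A different, weaker stand-in backbone: it never explores on its own."""
    successes = [plan for plan, reward in cases if reward > 0]
    return successes[0] if successes else CANDIDATES[0]


@dataclass
class CaseMemory:
    cases: dict = field(default_factory=dict)  # task -> list of (plan, reward)

    def read(self, task):
        return list(self.cases.get(task, []))

    def write(self, task, plan, reward):
        self.cases.setdefault(task, []).append((plan, reward))


def toy_reward(task, plan):
    # Toy environment (an assumption): only plans that verify succeed.
    return 1.0 if "verify" in plan else 0.0


def run_episodes(model, memory, task, n):
    """Each episode reads memory, acts, and writes the outcome back to memory."""
    rewards = []
    for _ in range(n):
        plan = model(task, memory.read(task))
        reward = toy_reward(task, plan)
        memory.write(task, plan, reward)
        rewards.append(reward)
    return rewards
```

Running `run_episodes(frozen_model, memory, "q", 4)` on an empty `CaseMemory` gives rewards `[0.0, 0.0, 1.0, 1.0]`: the function never changes, only the cases it is shown. Handing the same populated memory to `other_model` yields `1.0` on its first episode, whereas with an empty memory it keeps guessing and scores `0.0`. In this toy, the learned competence sits in the memory, not in either model.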

But transfer isn't free, and the corpus flags two cautions worth knowing. First, optimal memory granularity is domain-conditional: workflow-level memory wins in routine-rich domains, causal-rule memory in environment-rich ones, and state-action memory in spatially-rich web tasks (see "Does agent memory work better at one level of abstraction?"). A curator tuned for one domain's abstraction may transfer cleanly across *backbones* yet stumble across *domains*. Second, a curator can only pass on what it has captured: agents trained on static curated demonstrations do not generalize beyond the demonstrated scenarios, so their competence is capped by what the curators imagined (see "Can agents learn beyond what their training data shows?"). So the real ceiling on transfer may not be the executor but the richness of what the curator has learned to encode.

The genuinely interesting horizon here is that this decoupling points toward shared, collective curation. SkillClaw aggregates interaction trajectories across many users and synchronizes refined skills system-wide (see "How can agent systems share learned skills across users?"), and MAJ-EVAL shows document-grounded personas transferring across tasks without redesign (see "Can personas extracted from documents generalize across evaluation tasks?"). If curators transfer across backbones, the curator stops being a per-model accessory and becomes a portable, accumulating asset: the durable part of the system, while the executor underneath is just the interchangeable engine of the moment.

Sources (9 notes)

Can a separate trained curator improve skill libraries better than frozen agents?

SkillOS shows that separating a trainable curator from a frozen executor, grouped by task streams, causes skill repositories to shift from generic verbose additions toward actionable execution logic and cross-task meta-strategies. The trained curator generalizes across different executor backbones and domains.

Can agents learn new skills without forgetting old ones?

VOYAGER demonstrates that storing executable skills in an embedding-indexed library and composing complex skills from simpler ones allows agents to learn continuously while avoiding the forgetting that occurs with weight-update-based methods. Environmental feedback refines skills while an automatic curriculum drives continual exploration.

Can agents learn reusable sub-task routines from past experience?

Agent Workflow Memory induces sub-task routines at finer granularity than full tasks, abstracts example-specific values, and compounds them hierarchically. This produces 24.6% relative gain on Mind2Web and 51.1% on WebArena, with larger gains as train-test gaps widen.

Can agents learn continuously from experience without updating weights?

AgentFly formalizes agent learning as a Memory-augmented MDP with three memory modules (case, subtask, tool) that enable credit assignment and policy improvement entirely through memory operations. The approach achieved 87.88% on GAIA validation without modifying LLM parameters.

Can small language models handle most agent tasks?

SLMs handle the repetitive, well-defined language tasks that constitute most agent work at 10–30× lower cost than LLMs, making heterogeneous architectures (SLMs by default, LLMs selective) the economically rational design pattern.

Does agent memory work better at one level of abstraction?

Workflow-level memory wins in routine-rich domains, causal-rule memory in environment-rich domains, and state-action memory in spatially-rich web tasks. The optimal abstraction depends on whether task variance comes from arguments, causal structure, or fine-grained UI state.

Can agents learn beyond what their training data shows?

Agents trained on static expert datasets cannot learn from their own failures or generalize beyond demonstrated scenarios because they never interact with environments during training. Competence is capped by what curators imagined, not by agent capacity.

How can agent systems share learned skills across users?

SkillClaw aggregates interaction trajectories across users, processes them through an autonomous evolver that identifies patterns and refines skills, then synchronizes updates system-wide. This converts siloed individual learning into shared capability improvement without manual curation.

Can personas extracted from documents generalize across evaluation tasks?

MAJ-EVAL automatically extracts stakeholder personas from domain documents via semantic clustering and orchestrates structured three-phase debate, achieving reproducible evaluation that transfers across tasks like summarization and dialogue without manual redesign. The approach grounds personas in real stakeholder perspectives rather than arbitrary roles.
