INQUIRING LINE

Do learned workflows transfer between different agents with minimal accuracy loss?

This explores whether a skill or workflow one agent learns can be handed to a *different* agent and still work. The corpus reframes the question around *where* the learning is stored, not just how well it copies.


This reads the question as: when an agent learns a useful routine, does that routine survive being moved to another agent (a different model, backbone, or user's system) without falling apart? The corpus has a sharp answer hiding inside a reframing: transfer works well precisely *because* the most successful systems don't store learned workflows in model weights at all. They externalize them as text or executable artifacts, which makes them portable by construction.

The most direct evidence is SkillOS (see "Can a separate trained curator improve skill libraries better than frozen agents?"), which separates a *trainable curator* that evolves the skill library from a *frozen executor* that runs the skills. Because the curator's output is a repository of execution logic rather than a fine-tuned set of weights, the trained curator was shown to generalize across *different executor backbones and domains*. In other words, the workflows it produces aren't welded to the agent that helped create them. That's transfer with minimal loss, achieved by design rather than luck.
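To make the curator/executor split concrete, here is a minimal sketch, assuming skills are stored as plain-text step lists. `Skill`, `SkillRepository`, `run_with_skill`, and the two stand-in backbones are hypothetical names invented for illustration; this is not SkillOS's actual interface. The point is only that the same stored artifact can be rendered into a prompt for any executor.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Skill:
    """A learned routine stored as plain text, not as model weights."""
    name: str
    steps: List[str]


@dataclass
class SkillRepository:
    """Hypothetical external store: a curator writes to it, executors read from it."""
    skills: Dict[str, Skill] = field(default_factory=dict)

    def add(self, skill: Skill) -> None:
        self.skills[skill.name] = skill

    def render(self, name: str) -> str:
        """Turn a stored skill into numbered instructions for a prompt."""
        skill = self.skills[name]
        lines = [f"{i}. {step}" for i, step in enumerate(skill.steps, start=1)]
        return f"Skill: {skill.name}\n" + "\n".join(lines)


# An executor is any callable from prompt text to output text (assumption).
Executor = Callable[[str], str]


def run_with_skill(executor: Executor, repo: SkillRepository, name: str, task: str) -> str:
    """Hand the same externalized skill to whichever executor is supplied."""
    prompt = repo.render(name) + f"\nTask: {task}"
    return executor(prompt)


# Two stand-in "backbones"; real ones would call different language models.
def backbone_a(prompt: str) -> str:
    return f"[A] received {len(prompt.splitlines())} lines"


def backbone_b(prompt: str) -> str:
    return f"[B] received {len(prompt.splitlines())} lines"


repo = SkillRepository()
repo.add(Skill("book_flight", [
    "Open the search form",
    "Fill in origin and destination",
    "Submit and pick the cheapest result",
]))

out_a = run_with_skill(backbone_a, repo, "book_flight", "Fly from Oslo to Rome")
out_b = run_with_skill(backbone_b, repo, "book_flight", "Fly from Oslo to Rome")
```

Swapping `backbone_a` for `backbone_b` requires no change to the repository, which is the portability property the paragraph above describes.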

The same externalization logic shows up under different names. VOYAGER stores skills in an embedding-indexed library and composes complex behaviors from simpler ones (see "Can agents learn new skills without forgetting old ones?"). AgentFly does continual adaptation entirely through memory operations *without touching model parameters* (see "Can agents learn continuously from experience without updating weights?"). Agent Workflow Memory abstracts away example-specific values to induce reusable sub-task routines, notably with *larger* gains as the gap between training and test situations widens (see "Can agents learn reusable sub-task routines from past experience?"). That last detail is the quiet surprise: a well-abstracted workflow can transfer *better* the more the new context differs, because abstraction is what strips out the parts that wouldn't have carried over. Transfer across *people* is the explicit goal of SkillClaw, which aggregates interaction trajectories from many users and synchronizes refined skills back system-wide (see "How can agent systems share learned skills across users?").
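The idea of abstracting away example-specific values can be sketched as simple slot substitution. This is an illustrative assumption, not Agent Workflow Memory's actual induction procedure: `abstract_workflow` and `instantiate` are hypothetical helpers, and the sketch assumes the recorded steps contain no literal curly braces.

```python
from typing import Dict, List


def abstract_workflow(steps: List[str], values: Dict[str, str]) -> List[str]:
    """Replace concrete values in recorded steps with named placeholders."""
    abstracted = []
    for step in steps:
        for slot, value in values.items():
            step = step.replace(value, "{" + slot + "}")
        abstracted.append(step)
    return abstracted


def instantiate(template: List[str], values: Dict[str, str]) -> List[str]:
    """Fill the placeholders with values from a new situation."""
    return [step.format(**values) for step in template]


# One recorded trajectory, tied to a specific query and sort key.
trajectory = [
    "Type 'Oslo' into the search box",
    "Click the result for 'Oslo'",
    "Sort by 'price'",
]

# The reusable routine keeps the structure and drops the specifics.
template = abstract_workflow(trajectory, {"query": "Oslo", "sort_key": "price"})

# The same routine applied to a context it was never recorded in.
reused = instantiate(template, {"query": "Lima", "sort_key": "rating"})
```

Nothing in `template` mentions Oslo or price any more, so the routine carries over to any task with the same shape.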

But the corpus also marks the ceiling. Workflows that come from static expert demonstrations stay bounded by whatever the curator imagined: the agent never interacts with an environment during training, so it can't learn from its own failures or repair a routine that doesn't fit a new agent's situation (see "Can agents learn beyond what their training data shows?"). And transfer isn't free of side effects. In multi-agent settings, *where* a workflow sits matters, because high-influence positions amplify whatever signal flows through them, including malicious or sycophantic ones (see "How does workflow position shape attack propagation in multi-agent systems?"). So a transplanted workflow can carry transplanted vulnerabilities.
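One rough way to see why position matters is to count how many downstream steps each subtask can reach in a workflow graph. The sketch below uses downstream reach as a crude proxy for influence. That proxy, the `downstream` helper, and the four-node example graph are assumptions made for illustration, not the measure FLOWSTEER uses.

```python
from typing import Dict, List, Set


def downstream(graph: Dict[str, List[str]], node: str) -> Set[str]:
    """Collect every subtask reachable from `node` by following dependency edges."""
    seen: Set[str] = set()
    stack = list(graph.get(node, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(graph.get(current, []))
    return seen


# Hypothetical workflow: each key passes its output to the listed subtasks.
workflow = {
    "plan": ["search", "summarize"],
    "search": ["summarize"],
    "summarize": ["report"],
    "report": [],
}

# Number of downstream subtasks a signal injected at each position could touch.
reach = {node: len(downstream(workflow, node)) for node in workflow}
most_exposed = max(reach, key=reach.get)
```

In this toy graph a signal injected at `plan` can touch every other subtask, while one injected at `report` touches none, which is the kind of asymmetry a transplanted workflow brings with it.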

The less obvious takeaway: the field's answer to 'do workflows transfer?' is really an argument about *substrate*. Store learning in weights and it's stuck to one agent; store it as an externalized, abstracted, embedding-indexed artifact and it becomes a portable object that a separate curator can even keep improving on someone else's behalf. If you want to go deeper, SkillOS and Agent Workflow Memory are the two doorways: one for cross-backbone transfer, one for why abstraction is what makes a routine survive the move.


Sources (7 notes)

Can a separate trained curator improve skill libraries better than frozen agents?

SkillOS shows that separating a trainable curator from a frozen executor, grouped by task streams, causes skill repositories to shift from generic verbose additions toward actionable execution logic and cross-task meta-strategies. The trained curator generalizes across different executor backbones and domains.

Can agents learn new skills without forgetting old ones?

VOYAGER demonstrates that storing executable skills in an embedding-indexed library and composing complex skills from simpler ones allows agents to learn continuously while avoiding the forgetting that occurs with weight-update-based methods. Environmental feedback refines skills while an automatic curriculum drives continual exploration.

Can agents learn continuously from experience without updating weights?

AgentFly formalizes agent learning as a Memory-augmented MDP with three memory modules (case, subtask, tool) that enable credit assignment and policy improvement entirely through memory operations. The approach achieved 87.88% on GAIA validation without modifying LLM parameters.

Can agents learn reusable sub-task routines from past experience?

Agent Workflow Memory induces sub-task routines at finer granularity than full tasks, abstracts example-specific values, and compounds them hierarchically. This produces 24.6% relative gain on Mind2Web and 51.1% on WebArena, with larger gains as train-test gaps widen.

How can agent systems share learned skills across users?

SkillClaw aggregates interaction trajectories across users, processes them through an autonomous evolver that identifies patterns and refines skills, then synchronizes updates system-wide. This converts siloed individual learning into shared capability improvement without manual curation.

Can agents learn beyond what their training data shows?

Agents trained on static expert datasets cannot learn from their own failures or generalize beyond demonstrated scenarios because they never interact with environments during training. Competence is capped by what curators imagined, not by agent capacity.

How does workflow position shape attack propagation in multi-agent systems?

FLOWSTEER demonstrates that malicious signals propagate farther when injected into high-influence subtasks, and that framing them as evidence rather than instruction causes downstream agents to relay them. Influence concentrates where dependencies converge, making position-aware attacks far more effective.

Next inquiring lines