Agentic Systems and Planning

What makes agent-created code artifacts so hard to manage?

Agent-authored code that persists and is shared across systems raises difficult questions about what should be kept versus discarded, and how to maintain consistent state when multiple agents collaborate on the same artifacts.

Note · 2026-05-28 · sourced from Tool Computer Use

Among the three elements of agentic code — model capability, harness infrastructure, and agent-initiated artifacts — the survey flags the third as the one that "remains relatively underexplored." Agent-initiated code artifacts are the interactive objects an agent creates, executes, observes, revises, persists, and shares during a task: patches and tests authored over a live repository, interface commands synthesized against DOM trees, hypothesis-testing pipelines composed on the fly, executable policies and skill libraries revised in response to environment feedback. These appear across coding assistance, GUI/OS automation, scientific discovery, and embodied control — yet they sit outside the well-mapped territory of predefined infrastructure.

The open questions cluster around persistence and sharing. When an agent writes code that outlives the current step, what should persist and what should be discarded? When multiple agents share artifacts, how is consistent state maintained, and how is a useful artifact promoted from one-off scratch work to durable, reviewable infrastructure? The survey's listed open challenges (evaluation beyond final task success, verification under incomplete feedback, regression-free harness improvement, consistent shared state across agents, human oversight for safety-critical actions) converge on exactly this layer. The counterpoint is that some agent-authored code is genuinely disposable, and over-engineering its lifecycle wastes effort. Even so, the layer matters: the artifacts an agent creates may be where the next gains in autonomy and coordination live, and they are precisely what current harness engineering least understands.
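To make the persistence and sharing questions concrete, here is a minimal sketch of one possible artifact lifecycle. It is not taken from the survey: the names `ArtifactStore`, `Stage`, and `StaleWriteError`, the three-stage scratch/candidate/durable ladder, and the version-check rule for shared writes are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class Stage(Enum):
    # Hypothetical lifecycle stages; the survey does not prescribe these.
    SCRATCH = "scratch"      # one-off work, discarded when the task ends
    CANDIDATE = "candidate"  # flagged as worth keeping, awaiting review
    DURABLE = "durable"      # reviewed and kept as shared infrastructure


class StaleWriteError(Exception):
    """Raised when a writer's view of an artifact is out of date."""


@dataclass
class Artifact:
    name: str
    code: str
    stage: Stage = Stage.SCRATCH
    version: int = 1


@dataclass
class ArtifactStore:
    artifacts: dict[str, Artifact] = field(default_factory=dict)

    def create(self, name: str, code: str) -> Artifact:
        artifact = Artifact(name=name, code=code)
        self.artifacts[name] = artifact
        return artifact

    def revise(self, name: str, code: str, expected_version: int) -> Artifact:
        # Optimistic concurrency (an assumed policy): reject the write if
        # another agent revised the artifact after this writer last read it.
        artifact = self.artifacts[name]
        if artifact.version != expected_version:
            raise StaleWriteError(
                f"{name}: expected v{expected_version}, found v{artifact.version}"
            )
        artifact.code = code
        artifact.version += 1
        return artifact

    def promote(self, name: str, reviewed: bool = False) -> Artifact:
        # Scratch work moves to candidate freely; candidate work becomes
        # durable only after a review has been recorded.
        artifact = self.artifacts[name]
        if artifact.stage is Stage.SCRATCH:
            artifact.stage = Stage.CANDIDATE
        elif artifact.stage is Stage.CANDIDATE and reviewed:
            artifact.stage = Stage.DURABLE
        return artifact

    def end_task(self) -> list[str]:
        # Discard everything that never left the scratch stage.
        dropped = [n for n, a in self.artifacts.items() if a.stage is Stage.SCRATCH]
        for n in dropped:
            del self.artifacts[n]
        return dropped
```

In this sketch, disposable code costs nothing to manage because it is dropped by default at `end_task`, while anything an agent promotes has to pass a review gate before it counts as durable. Whether a single version counter is enough for real multi-agent coordination is exactly the kind of question the note describes as open.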


— "Code as Agent Harness: Toward Executable, Verifiable, and Stateful Agent Systems", https://arxiv.org/abs/2605.18747

Original note title

agent-initiated code artifacts that persist and are shared are the underexplored frontier of harness engineering