Agentic and Multi-Agent Systems

Does agent memory work better at one level of abstraction?

Three competing architectures claim superior agent memory transfer using different abstraction levels. Do they all work, or does one architecture genuinely outperform the others across domains?

Note · 2026-05-03 · sourced from Action Models

Three papers from the agentic cluster (AWM, CLIN, and PRAXIS) each propose a different shape for agent memory, and each reports transfer gains. AWM extracts abstracted sub-task workflows ("search for a {product-name} on Amazon"), CLIN extracts causal abstractions ("opening doors may be necessary for movement between rooms"), and PRAXIS recalls actions locally, conditioned on the current state. The papers seem to give incompatible answers only because they implicitly answer different questions. The resolution is not "one wins" but "each wins in the domain where its abstraction matches the structure of the task."
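To make the three abstraction levels concrete, here is a minimal Python sketch of what one memory entry might look like at each level. The class names, field names, and the `recall` helper are hypothetical and chosen only for illustration; they are not the representations used in the AWM, CLIN, or PRAXIS papers. The workflow template and the causal rule reuse the examples quoted above, while the UI state keys and actions are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkflowEntry:
    """Workflow-level entry: a sub-task template with named argument slots."""
    template: str

    def fill(self, slots: dict[str, str]) -> str:
        # Substitute each "{slot}" placeholder with its concrete value.
        text = self.template
        for name, value in slots.items():
            text = text.replace("{" + name + "}", value)
        return text


@dataclass(frozen=True)
class CausalRule:
    """Causal-level entry: a rule linking an action to a possible effect."""
    cause: str
    relation: str
    effect: str

    def render(self) -> str:
        return f"{self.cause} {self.relation} {self.effect}"


@dataclass(frozen=True)
class StateActionEntry:
    """State-action entry: an action remembered for one specific local state."""
    state_key: str  # hypothetical fingerprint of the local UI state
    action: str


def recall(entries: list[StateActionEntry], state_key: str) -> list[str]:
    # Return only the actions stored under exactly this state key.
    return [e.action for e in entries if e.state_key == state_key]


workflow = WorkflowEntry("search for a {product-name} on Amazon")
rule = CausalRule("opening doors", "may be necessary for", "movement between rooms")
# Invented UI states and actions, for illustration only.
traces = [
    StateActionEntry("menu:open", "click 'Export'"),
    StateActionEntry("menu:closed", "click 'More'"),
]
```

The sketch shows where each shape keeps its information: the workflow entry generalizes over arguments, the causal rule generalizes over situations, and the state-action entry does not generalize at all but keeps the local detail intact.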

Three domain-shape signatures predict three memory shapes:

Routine-rich domains (e-commerce flows, customer-service scripts, repetitive browser tasks): the variance is in arguments, not in topology. The same workflow recurs with different parameters. Workflow-routine memory compounds because complex workflows are built by composing simpler ones, and the composition graph stays stable across instances. AWM wins.

Environment-rich domains (embodied agents, scientific simulators, novel game environments): the variance is in causal structure, not in arguments. Action consequences depend on environmental state in ways that can be summarized as causal rules. Workflow memory fails because there are no recurring workflows; state-action memory fails because the state space is too large to recall locally. Causal-rule memory transfers because causal structure is the invariant. CLIN wins.

Spatially-rich web tasks (modern web UIs with dense local affordances, dynamic menus, context-dependent actions): the variance is in fine-grained UI state. Workflow abstractions throw away the local visual cues that distinguish a working action from a broken one. State-action local recall preserves what AWM compresses out. PRAXIS wins.

The deeper claim: agent memory design is not a horse race between architectures but a domain-classification problem. Before choosing a memory architecture, classify the deployment domain along the routine-richness, environment-causality, and spatial-density axes — each axis predicts a memory shape. Reframing the AWM/CLIN/PRAXIS contest this way also explains why parallel benchmark wins coexisted: the benchmarks differed along these axes too, so each architecture won in its native habitat. A composite memory system that selects abstraction level per task class would likely beat any single-architecture system on a heterogeneous workload.
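The classify-then-select idea can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the note names the three axes but does not say how to score them, so the `DomainProfile` scores in [0, 1], the highest-score selection rule, and the example workload values are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class MemoryShape(Enum):
    WORKFLOW = "workflow-level"
    CAUSAL = "causal-level"
    STATE_ACTION = "state-action-level"


@dataclass(frozen=True)
class DomainProfile:
    """Hypothetical scores in [0, 1] along the three axes named in the note."""
    routine_richness: float
    environment_causality: float
    spatial_density: float


def select_memory_shape(profile: DomainProfile) -> MemoryShape:
    # Assumed rule: pick the shape whose axis scores highest.
    # On a tie, max() keeps the first key in insertion order.
    scores = {
        MemoryShape.WORKFLOW: profile.routine_richness,
        MemoryShape.CAUSAL: profile.environment_causality,
        MemoryShape.STATE_ACTION: profile.spatial_density,
    }
    return max(scores, key=lambda shape: scores[shape])


# Invented scores for a heterogeneous workload, for illustration only.
workload = {
    "e-commerce checkout": DomainProfile(0.9, 0.2, 0.3),
    "science simulator": DomainProfile(0.1, 0.8, 0.2),
    "dense web dashboard": DomainProfile(0.3, 0.2, 0.9),
}

# A composite system would route each task class to its own memory shape.
plan = {task: select_memory_shape(profile) for task, profile in workload.items()}
```

In this sketch, `plan` maps each task class to one memory shape, which is the per-task-class selection that a composite memory system would need. In practice the hard part is producing the axis scores, which the sketch takes as given.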



Original note title

agent memory granularity is domain-conditional — workflow-level for routine-rich tasks, causal-level for environment-rich tasks, state-action-level for spatially-rich web tasks