Memory in the Age of AI Agents: A Survey — Forms, Functions and Dynamics

Paper · arXiv 2512.13564
LLM Memory · LLM Agents · Retrieval-Augmented Generation (RAG) · Context Engineering

Memory has emerged as, and will continue to remain, a core capability of foundation model-based agents. It underpins long-horizon reasoning, continual adaptation, and effective interaction with complex environments. As research on agent memory rapidly expands and attracts unprecedented attention, the field has also become increasingly fragmented. Existing works that fall under the umbrella of agent memory often differ substantially in their motivations, implementations, assumptions, and evaluation protocols, while the proliferation of loosely defined memory terminology has further obscured conceptual clarity. Traditional taxonomies such as long/short-term memory have proven insufficient to capture the diversity and dynamics of contemporary agent memory systems. We examine agent memory through the unified lenses of forms, functions, and dynamics. From the perspective of forms, we identify three dominant realizations of agent memory, namely token-level, parametric, and latent memory. From the perspective of functions, we move beyond coarse temporal categorizations and propose a finer-grained taxonomy that distinguishes factual, experiential, and working memory. From the perspective of dynamics, we analyze how memory is formed, evolved, and retrieved over time as agents interact with their environments.

Agent Memory Needs A New Taxonomy. The motivation for a new taxonomy and survey is twofold. Limitations of Existing Taxonomies: While several recent surveys have provided valuable and comprehensive overviews of agent memory, their taxonomies were developed prior to a number of rapid methodological advances and therefore do not fully reflect the current breadth and complexity of the research landscape. For example, emerging directions in 2025, such as memory frameworks that distill reusable tools from past experiences, or memory-augmented test-time scaling methods, remain underrepresented in earlier classification schemes. Conceptual Fragmentation: With the explosive growth of memory-related studies, the concept itself has become increasingly expansive and fragmented. Researchers often find that papers claiming to study "agent memory" differ drastically in implementation, objectives, and underlying assumptions. The proliferation of diverse terminologies (declarative, episodic, semantic, parametric memory, etc.) further obscures conceptual clarity, highlighting the urgent need for a coherent taxonomy that can unify these emerging concepts.

The dynamics of the memory system are characterized by three conceptual operators. Memory Formation: At time step t, the agent produces informational artifacts, which may include tool outputs, reasoning traces, partial plans, self-evaluations, or environmental feedback. A formation operator selectively transforms these artifacts into memory candidates, extracting information with potential future utility rather than storing the entire interaction history verbatim. Memory Evolution: Formed memory candidates are integrated into the existing memory base through an evolution operator, which may consolidate redundant entries, resolve conflicts, discard low-utility information, or restructure memory for efficient retrieval. The resulting memory state persists across subsequent decision steps and tasks. Memory Retrieval: When selecting an action, the agent retrieves a context-dependent memory signal through a retrieval operator R, which constructs a task-aware query and returns relevant memory content. Short-term and long-term memory phenomena therefore emerge not from discrete architectural modules but from the temporal patterns with which formation, evolution, and retrieval are engaged.
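The three operators can be illustrated with a small Python sketch. This is not the survey's formalization or any particular system's implementation: the names `MemoryItem`, `MemoryStore`, `form`, `evolve`, and `retrieve`, the numeric utility score, the utility threshold, the capacity limit, and the word-overlap ranking are all assumptions chosen to keep the example self-contained. A real system would likely use an LLM or learned scorer for formation and an embedding index for retrieval.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryItem:
    text: str
    utility: float  # hypothetical estimate of future usefulness, in [0, 1]


@dataclass
class MemoryStore:
    items: list[MemoryItem] = field(default_factory=list)


def form(artifacts: list[tuple[str, float]], min_utility: float = 0.5) -> list[MemoryItem]:
    """Formation: turn raw artifacts into candidates, keeping only those
    whose (assumed) utility score clears the threshold."""
    return [MemoryItem(text, u) for text, u in artifacts if u >= min_utility]


def evolve(store: MemoryStore, candidates: list[MemoryItem], capacity: int = 100) -> MemoryStore:
    """Evolution: merge candidates into the store, collapse duplicates
    (same text ignoring case and outer whitespace, keeping the higher
    utility), and discard the lowest-utility items beyond capacity."""
    merged: dict[str, MemoryItem] = {}
    for item in store.items + candidates:
        key = item.text.strip().lower()
        if key not in merged or item.utility > merged[key].utility:
            merged[key] = item
    kept = sorted(merged.values(), key=lambda m: m.utility, reverse=True)[:capacity]
    return MemoryStore(kept)


def retrieve(store: MemoryStore, observation: str, k: int = 3) -> list[str]:
    """Retrieval: build a bag-of-words query from the observation and return
    up to k items ranked by word overlap, breaking ties by utility."""
    query = set(observation.lower().split())
    scored = [(len(query & set(m.text.lower().split())), m) for m in store.items]
    ranked = sorted(
        (pair for pair in scored if pair[0] > 0),
        key=lambda pair: (-pair[0], -pair[1].utility),
    )
    return [m.text for _, m in ranked[:k]]


# One step of the loop: artifacts -> candidates -> updated store -> retrieval.
artifacts = [
    ("search tool returned the API rate limit of 60 requests per minute", 0.9),
    ("ok", 0.1),  # low utility, dropped at formation
    ("plan: batch requests to stay under the rate limit", 0.7),
]
store = evolve(MemoryStore(), form(artifacts))
print(retrieve(store, "what is the rate limit"))
```

In this sketch nothing is labeled short-term or long-term; how long an item survives depends only on how often `evolve` runs and how aggressively it prunes, which mirrors the point that the distinction arises from usage patterns rather than from separate modules.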

At a high level, agent memory almost fully subsumes what has traditionally been referred to as LLM memory. Many works since 2023 that describe themselves as "LLM memory mechanisms" are, under a modern and more mature understanding of agency, more appropriately interpreted as early instances of agent memory. That said, the subsumption is not absolute. A distinct line of research genuinely concerns LLM-internal memory: managing the transformer's key–value (KV) cache, designing long-context processing mechanisms, or modifying model architectures to better retain information as sequence length grows. These works focus on intrinsic model dynamics and typically address tasks that do not require agentic behavior, and thus should be considered outside the scope of agent memory.