Agentic and Multi-Agent Systems · LLM Reasoning and Architecture

Can frozen language models learn without updating their parameters?

If an agent's underlying model is frozen, what kind of memory structure would let it keep improving across trials and transfer what it learns to new tasks? This challenges the assumption that continual learning requires parameter updates.

Note · 2026-05-03 · sourced from Action Models

CLIN argues that the bottleneck for continual learning in language agents is not parameter updates but the structure of what gets remembered. Reflexion-style agents (see Can agents learn from failure without updating their weights?) maintain "helpful hints" — generic verbal reflections that work for the immediate trial but transfer poorly across tasks and environments. CLIN's wager is that a specific style of memory — causal abstractions of the form "opening doors may be necessary for movement between rooms" — produces durable, transferable knowledge because causal structure is what predicts which action to take next.
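As a rough sketch of what such a memory might look like (these class and field names are illustrative assumptions, not CLIN's implementation; the paper keeps the abstractions as plain natural-language sentences), each entry pairs an action with an outcome through a hedged causal link and can be rendered back into the text the frozen model reads in-context:

```python
from dataclasses import dataclass, field

# Hypothetical sketch of a causal-abstraction memory entry.
# CLIN stores these as natural-language sentences; the structured
# fields below are an assumption made for illustration only.
@dataclass
class CausalAbstraction:
    action: str                                # e.g. "opening doors"
    outcome: str                               # e.g. "movement between rooms"
    relation: str = "may be necessary for"     # hedged causal link
    helpful: bool = True                       # did it prove useful in the last trial?

    def to_text(self) -> str:
        # Render back to the natural-language form the agent reads in-context.
        return f"{self.action} {self.relation} {self.outcome}"

@dataclass
class Memory:
    entries: list[CausalAbstraction] = field(default_factory=list)

    def as_prompt_block(self) -> str:
        # Only retained (helpful) abstractions are injected into the
        # frozen model's prompt before the next trial.
        return "\n".join(e.to_text() for e in self.entries if e.helpful)

if __name__ == "__main__":
    mem = Memory([CausalAbstraction("opening doors", "movement between rooms")])
    print(mem.as_prompt_block())
    # -> opening doors may be necessary for movement between rooms
```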

Empirically, the wager pays off. On ScienceWorld, CLIN beats SOTA reflective agents like Reflexion by 23 absolute points on repeated trials. More importantly, it transfers: zero-shot performance on new environments improves by 4 points (13 for new tasks), and continued memory updates in the new setting add another 17 points (7 for new tasks). The causal-abstraction memory is therefore not just a within-task accelerator but a substrate for cross-environment generalization.

The conceptual move is to position language-model agents as a modern instantiation of action model learning — but with the action model written in natural language and continually edited rather than learned as parameters. Useful causal knowledge persists across trials; unhelpful causal knowledge is dropped. This suggests a new architectural pattern: agents built on frozen models can still improve rapidly and continually if the memory representation is the right shape. The shape that matters is causal, not encyclopedic — a position that pairs interestingly with Can agents learn reusable sub-task routines from past experience? (workflow-shaped memory) and Does state-indexed memory outperform high-level workflow memory for web agents? (state-action-shaped memory). The three notes target the same problem (what shape should agent memory take?) and disagree on the answer.
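A minimal sketch of that keep-or-drop loop, assuming a generic `llm` callable and a `run_trial` hook standing in for the environment (neither reflects CLIN's actual prompts or the ScienceWorld interface): the model weights never change, only the memory text is rewritten between trials.

```python
from typing import Callable

# Hypothetical agent loop: the frozen model acts, then edits its own
# textual memory of causal abstractions. The llm, run_trial, and prompt
# wording are placeholders, not CLIN's actual interfaces.
def run_with_continual_memory(
    llm: Callable[[str], str],
    run_trial: Callable[[str], tuple[str, float]],  # memory text -> (trace, score)
    num_trials: int = 5,
) -> str:
    memory = ""  # starts empty; grows and shrinks as natural-language abstractions
    for _ in range(num_trials):
        trace, score = run_trial(memory)  # frozen model acts, guided by memory
        # After the trial, ask the (still frozen) model to edit the memory:
        # keep abstractions that predicted useful actions, drop the rest,
        # and induce new ones from the latest trace.
        memory = llm(
            "Current memory (causal abstractions, one per line):\n"
            f"{memory}\n\n"
            f"Trial trace:\n{trace}\n\nTrial score: {score}\n\n"
            "Rewrite the memory: keep abstractions that helped, remove ones that "
            "did not, and add new ones of the form 'X may be necessary for Y'."
        )
    return memory  # transferable: can seed trial 0 in a new task or environment
```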


Source: Action Models

Original note title: causal abstractions in dynamic textual memory let frozen-model agents continually improve — outperforming Reflexion by 23 points without parameter updates