Can agents learn from failure without updating their weights?
Explores whether language models can improve through trial-and-error by storing reflections in memory rather than through gradient-based parameter updates. Tests if environmental feedback alone can drive learning.