LLM Reasoning and Architecture · Reinforcement Learning for LLMs

Can cognition work by reusing memory instead of recomputing?

Does intelligence emerge from structured navigation of prior inference paths rather than fresh computation? This challenges the assumption that brains and AI systems must constantly recompute, rather than leveraging stored trajectories for efficiency.

Note · 2026-02-23 · sourced from Novel Architectures

Memory-Amortized Inference (MAI) proposes that intelligence is fundamentally non-ergodic: it emerges from structured reuse of prior inference trajectories, not from uniform sampling or optimization from scratch. This is a sharp departure from standard computational models where each inference begins fresh.

The core framework: cognition is modeled as inference over latent cycles in memory. Memory trajectories define topologically stable, entropy-minimizing paths through representational space. The system navigates over constrained latent manifolds guided by persistent topological memory — enabling context-aware, structure-preserving inference with dramatically reduced computational cost.
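The "structured reuse" idea can be reduced to a toy sketch: a memory that stores past inference trajectories keyed by their latent start points and, for a new query, reuses the nearest stored result instead of recomputing. This is an illustrative analogy (essentially a nearest-neighbour cache), not the MAI formalism; all names and the distance threshold are hypothetical.

```python
import numpy as np

class TrajectoryMemory:
    """Toy memory of past inference trajectories in a latent space.

    For a new query, retrieve the stored trajectory whose start point
    is nearest and reuse its result rather than recomputing -- a
    nearest-neighbour cache standing in for MAI's structured reuse.
    """

    def __init__(self):
        self.starts = []    # latent start points of stored trajectories
        self.results = []   # the inference result each trajectory reached

    def store(self, start, result):
        self.starts.append(np.asarray(start, dtype=float))
        self.results.append(result)

    def infer(self, query, compute, tol=0.5):
        """Reuse a stored trajectory if one starts within tol of the
        query; otherwise compute fresh and memorize the result."""
        q = np.asarray(query, dtype=float)
        if self.starts:
            dists = [np.linalg.norm(q - s) for s in self.starts]
            i = int(np.argmin(dists))
            if dists[i] <= tol:
                return self.results[i]   # amortized: no recomputation
        result = compute(q)              # fresh, energy-intensive path
        self.store(q, result)
        return result

mem = TrajectoryMemory()
expensive = lambda x: float(np.sum(x ** 2))        # stand-in for costly inference
first = mem.infer([1.0, 2.0], expensive)           # computed fresh, then stored
reused = mem.infer([1.1, 2.0], expensive)          # within tol: reused, not recomputed
```

The threshold `tol` plays the role of the "constrained latent manifold": only queries close enough to a remembered path get the cheap, memory-guided answer.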

The most provocative claim is the time-reversal duality between MAI and RL: whereas RL propagates value forward from reward (bootstrapping over futures), MAI reconstructs latent causes backward from memory structures (inferring the past from its traces). Both rely on partial, structure-aware updates to minimize uncertainty. This duality allows MAI to invert RL's reward-driven flow, replacing energy-intensive iteration with structure-aware reuse.
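The direction of the two update flows can be contrasted in a few lines. The sketch below is a loose illustration under simplifying assumptions (a 5-state chain, reward only at the terminal state), not the paper's formalism: the RL pass bootstraps value backward from future reward via V(s) ← r + γ·V(s'), while the "time-reversed" pass walks a remembered trajectory from its outcome back to its origin, crediting earlier states as latent causes.

```python
import numpy as np

# Toy chain with 5 states; moving right from state i leads to i+1,
# with reward 1.0 only on reaching the terminal state 4.
N, gamma = 5, 0.9

# RL direction: value propagates from future reward via bootstrapped
# backups V(s) <- r + gamma * V(s').
V = np.zeros(N)
for _ in range(N):                       # sweep until converged on this chain
    for s in range(N - 1):
        r = 1.0 if s + 1 == N - 1 else 0.0
        V[s] = r + gamma * V[s + 1]

# Time-reversed direction: given a stored trace that ended in the
# terminal state, walk it backward and credit each earlier state as a
# latent cause of the outcome, discounting by distance into the past.
trace = [0, 1, 2, 3, 4]                  # a remembered trajectory
credit = np.zeros(N)
c = 1.0
for s in reversed(trace):
    credit[s] = c
    c *= gamma
```

Both passes produce geometrically discounted quantities, but they answer dual questions: `V` asks "what is this state worth, given the future?", `credit` asks "how much did this state contribute, given the remembered past?".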

Practical implications remain speculative: this is highly theoretical and requires empirical validation. But the conceptual contribution is clear: if RL is "learning what to do from future rewards," MAI is "learning what happened from past memories." The suggestion that these are formally dual operations, connecting memory-based cognition to reward-driven decision-making, opens a new theoretical perspective on the relationship between memory and agency.

