LLM Reasoning and Architecture · Reinforcement Learning for LLMs

Can cognition work by reusing memory instead of recomputing?

Does intelligence emerge from structured navigation of prior inference paths rather than fresh computation? This challenges the assumption that brains and AI systems must constantly recompute, rather than leveraging stored trajectories for efficiency.

Note · 2026-02-23 · sourced from Novel Architectures

Memory-Amortized Inference (MAI) proposes that intelligence is fundamentally non-ergodic: it emerges from structured reuse of prior inference trajectories, not from uniform sampling or optimization from scratch. This is a sharp departure from standard computational models where each inference begins fresh.

The core framework: cognition is modeled as inference over latent cycles in memory. Memory trajectories define topologically stable, entropy-minimizing paths through representational space. The system navigates over constrained latent manifolds guided by persistent topological memory — enabling context-aware, structure-preserving inference with dramatically reduced computational cost.
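The "structured reuse" idea can be reduced to a toy sketch: a memory that stores past inference trajectories keyed by their latent start points and, for a new query, reuses the nearest stored result instead of recomputing. This is an illustrative analogy (essentially a nearest-neighbour cache), not the MAI formalism; all names and the distance threshold are hypothetical.

```python
import numpy as np

class TrajectoryMemory:
    """Toy memory of past inference trajectories in a latent space.

    For a new query, retrieve the stored trajectory whose start point
    is nearest and reuse its result rather than recomputing -- a
    nearest-neighbour cache standing in for MAI's structured reuse.
    """

    def __init__(self):
        self.starts = []    # latent start points of stored trajectories
        self.results = []   # the inference result each trajectory reached

    def store(self, start, result):
        self.starts.append(np.asarray(start, dtype=float))
        self.results.append(result)

    def infer(self, query, compute, tol=0.5):
        """Reuse a stored trajectory if one starts within tol of the
        query; otherwise compute fresh and memorize the result."""
        q = np.asarray(query, dtype=float)
        if self.starts:
            dists = [np.linalg.norm(q - s) for s in self.starts]
            i = int(np.argmin(dists))
            if dists[i] <= tol:
                return self.results[i]   # amortized: no recomputation
        result = compute(q)              # fresh, energy-intensive path
        self.store(q, result)
        return result

mem = TrajectoryMemory()
expensive = lambda x: float(np.sum(x ** 2))        # stand-in for costly inference
first = mem.infer([1.0, 2.0], expensive)           # computed fresh, then stored
reused = mem.infer([1.1, 2.0], expensive)          # within tol: reused, not recomputed
```

The threshold `tol` plays the role of the "constrained latent manifold": only queries close enough to a remembered path get the cheap, memory-guided answer.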

The most provocative claim is the time-reversal duality between MAI and RL: whereas RL propagates value forward from reward (bootstrapping over futures), MAI reconstructs latent causes backward from memory structures (inferring the past from its traces). Both rely on partial, structure-aware updates to minimize uncertainty. This duality allows MAI to invert RL's reward-driven flow, replacing energy-intensive iteration with structure-aware reuse.
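The direction of the two update flows can be contrasted in a few lines. The sketch below is a loose illustration under simplifying assumptions (a 5-state chain, reward only at the terminal state), not the paper's formalism: the RL pass bootstraps value backward from future reward via V(s) ← r + γ·V(s'), while the "time-reversed" pass walks a remembered trajectory from its outcome back to its origin, crediting earlier states as latent causes.

```python
import numpy as np

# Toy chain with 5 states; moving right from state i leads to i+1,
# with reward 1.0 only on reaching the terminal state 4.
N, gamma = 5, 0.9

# RL direction: value propagates from future reward via bootstrapped
# backups V(s) <- r + gamma * V(s').
V = np.zeros(N)
for _ in range(N):                       # sweep until converged on this chain
    for s in range(N - 1):
        r = 1.0 if s + 1 == N - 1 else 0.0
        V[s] = r + gamma * V[s + 1]

# Time-reversed direction: given a stored trace that ended in the
# terminal state, walk it backward and credit each earlier state as a
# latent cause of the outcome, discounting by distance into the past.
trace = [0, 1, 2, 3, 4]                  # a remembered trajectory
credit = np.zeros(N)
c = 1.0
for s in reversed(trace):
    credit[s] = c
    c *= gamma
```

Both passes produce geometrically discounted quantities, but they answer dual questions: `V` asks "what is this state worth, given the future?", `credit` asks "how much did this state contribute, given the remembered past?".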

Practical implications remain speculative: this is highly theoretical and requires empirical validation. But the conceptual contribution is clear: if RL is "learning what to do from future rewards," MAI is "learning what happened from past memories." The suggestion that these are formally dual operations, connecting memory-based cognition to reward-driven decision-making, opens a new theoretical perspective on the relationship between memory and agency.

