How does co-activation shape which memories become linked together?

This explores the Hebbian idea that things activated together get wired together — what makes two memories link rather than stay separate — and what the corpus says about that mechanism in both brains and models.

This reads the question as being about the linking mechanism itself: when does simultaneous activation cause two pieces of memory to bind, and when do they stay isolated? The corpus doesn't have a single paper that says 'neurons that fire together wire together,' but several notes circle the same territory under different names, and read together they sketch a surprisingly concrete answer.

The sharpest evidence comes from work on priming after learning (see "Can we predict keyword priming before learning happens?"). It finds that whether a keyword gets linked to a context after training is predictable from how probable that keyword already was *before* learning: there is a threshold (around 10^-3) below which co-occurrence simply doesn't form a link, and above which just three exposures are enough. That's the core of your question made measurable: co-activation only forges a connection when the keyword was already probable enough in that context to begin with. Linking isn't automatic; it has an ignition point.
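The threshold behaviour described above can be pictured as a simple gate. This is a toy sketch, not the method used in the priming work: the function name `link_forms` and its parameters are hypothetical, the threshold of 10^-3 and the three-exposure figure are taken from the note, and treating fewer than three exposures as "no link" is an assumption made only for illustration.

```python
def link_forms(prior_prob, exposures, threshold=1e-3, min_exposures=3):
    """Toy gate for whether co-occurrence is expected to form a link.

    prior_prob    -- probability of the keyword in the context BEFORE learning
    exposures     -- number of training exposures to the co-occurrence
    threshold     -- approximate ignition point reported in the note (~10^-3)
    min_exposures -- exposures the note reports as sufficient above threshold
                     (treating fewer as insufficient is an assumption)
    """
    # Below the threshold, more exposures do not help in this toy model.
    if prior_prob < threshold:
        return False
    # Above the threshold, a handful of exposures is enough.
    return exposures >= min_exposures


# A keyword that was already fairly likely links after three exposures.
likely_keyword = link_forms(prior_prob=1e-2, exposures=3)

# A keyword that was very unlikely does not link, however often it co-occurs.
unlikely_keyword = link_forms(prior_prob=1e-4, exposures=100)
```

The point of the sketch is the asymmetry: exposure count only matters once the prior probability clears the threshold.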

But co-activation linking the *wrong* things is its own failure mode. In chain-of-thought reasoning, 'local memorization' (binding to the tokens that immediately precede the current one) accounts for up to 67% of reasoning errors (see "Where do memorization errors arise in chain-of-thought reasoning?"). So proximity of activation is a double-edged sword: it's how useful associations form and how spurious ones intrude. The binding problem (see "Why do neural networks fail at compositional generalization?") explains the structural limit underneath both: neural nets struggle to keep co-active features *segregated* into distinct entities, so co-activation can blur together things that should stay separate. Linking and over-linking are the same machinery seen from two sides.

What decides whether co-activation produces a durable, well-organized link rather than a transient or noisy one seems to be consolidation. The brain-memory mapping note (see "Can brain memory systems explain how LLMs should store knowledge?") frames this directly: rapid hippocampal-style encoding captures co-occurrences fast, but it takes a slower consolidation step to integrate them into stable, distributed knowledge, and the note flags missing consolidation as exactly why current systems fail to integrate memories well. Memory-amortized inference (see "Can cognition work by reusing memory instead of recomputing?") complements this: it pictures cognition as navigating a *topological* memory where reused inference paths become the linked structure. Co-activation here is the act of traversing the same trajectory again, which is what carves the link deeper.
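The fast-encoding, slow-consolidation split can be sketched as a two-stage store. This is a minimal illustration under stated assumptions, not an implementation from any of the notes: the class `TwoStageMemory`, its fields, and the rule "consolidate a pair once it has co-occurred at least twice" are all hypothetical choices made to show the shape of the idea.

```python
from collections import Counter
from itertools import combinations


class TwoStageMemory:
    """Toy two-tier memory: a fast co-occurrence buffer plus a slow link store."""

    def __init__(self, consolidate_after=2):
        self.fast = Counter()  # rapid, hippocampal-style record of co-occurrences
        self.slow = set()      # stable links that survived consolidation
        self.consolidate_after = consolidate_after  # assumed repetition threshold

    def encode(self, active_items):
        """Record every pair of items that were active together in one episode."""
        for pair in combinations(sorted(set(active_items)), 2):
            self.fast[pair] += 1

    def consolidate(self):
        """Promote repeated pairs to the slow store, then clear the fast buffer."""
        for pair, count in self.fast.items():
            if count >= self.consolidate_after:
                self.slow.add(pair)
        self.fast.clear()


memory = TwoStageMemory()
memory.encode(["coffee", "morning", "rain"])  # one-off episode
memory.encode(["coffee", "morning"])          # the same pairing recurs
memory.consolidate()
```

After consolidation, only the pairing that recurred (`coffee` with `morning`) remains in `memory.slow`; the one-off pairings with `rain` were captured by the fast buffer but never integrated. Without the `consolidate` step, the store would hold raw co-occurrences only, which is the gap the note attributes to current systems.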

The thing you may not have known to ask: linking isn't only about *what* co-activates but *which abstraction level* survives. The personalization work (see "Does abstract preference knowledge outperform specific interaction recall?") shows that abstract semantic summaries beat raw recalled episodes, meaning the connections worth keeping aren't the literal co-occurrences but the compressed pattern distilled from many of them. Co-activation feeds the raw material; consolidation and abstraction decide which links are worth keeping.

Sources (6 notes)

Can we predict keyword priming before learning happens?

Pre-learning keyword probability strongly predicts post-learning priming across architectures and model sizes, with a ~10^-3 threshold separating contexts where priming occurs from those where it doesn't. Just 3 training exposures suffice to establish the effect.

Where do memorization errors arise in chain-of-thought reasoning?

STIM framework identifies local, mid-range, and long-range memorization sources in CoT reasoning. Local memorization—based on preceding tokens—accounts for up to 67% of reasoning errors, especially as complexity increases and distributional shift occurs.

Why do neural networks fail at compositional generalization?

Greff et al. argue that neural networks cannot dynamically bind distributed information into compositional structures due to three failures: segregating entities from inputs, maintaining representational separation, and reusing learned structure in novel combinations. Scaling can partially overcome this by enabling compositional representations to emerge.

Can brain memory systems explain how LLMs should store knowledge?

Research shows transformer weights function as a distributed neocortex for consolidated knowledge, RAG stores as hippocampal indexing for rapid encoding, and agentic state as prefrontal executive control. The CLS framework predicts why hybrid systems outperform single-tier approaches and identifies missing consolidation mechanisms that prevent memory integration.

Can cognition work by reusing memory instead of recomputing?

Memory-Amortized Inference proposes intelligence arises from structured reuse of prior inference paths over topological memory, inverting RL's reward-forward logic into cause-backward reconstruction. This duality explains energy efficiency and suggests memory trajectories form the substrate of adaptive thought.

Does abstract preference knowledge outperform specific interaction recall?

PRIME framework shows semantic memory (preference summaries, parametric encodings) consistently beats episodic memory (retrieved past interactions) across models. Recency-based recall outperforms similarity-based retrieval, and task fine-tuning exceeds preference tuning methods.
