Can deep learning theory unify around training dynamics?

Is learning mechanics—focused on average-case predictions and training dynamics rather than worst-case bounds—the emerging framework that finally unifies fragmented deep learning theory?

Note · 2026-05-18 · sourced from Foundation Models

Deep learning is the most powerful and most inscrutable member of the machine learning pantheon. Decades of attempts to put rigorous theoretical backing behind it have produced fragments (solvable toy models, scaling laws, hyperparameter limits, universal behaviors) but no unified frame. The paper "There Will Be a Scientific Theory of Deep Learning" argues that these fragments are not isolated: they are converging into a single discipline that the authors call learning mechanics.

Five strands point toward this unification: (1) solvable idealized settings provide intuition for realistic systems, (2) tractable limits reveal fundamental phenomena, (3) simple mathematical laws capture macroscopic observables, (4) hyperparameter theories disentangle which parameters drive behavior, and (5) universal behaviors across systems clarify which phenomena need explanation. Each of these mirrors a move that classical, continuum, statistical, or quantum mechanics made for physical systems. The analogy is structural, not rhetorical: both fields develop libraries of solvable settings, both work with aggregate statistics rather than per-particle motion, both treat system parameters as first-class objects, and both encounter universality across regimes.
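Strand (3) can be made concrete with a small sketch. The example below assumes the macroscopic observable is a loss that follows a pure power law in model size, loss = a * size^(-b), and recovers a and b by least squares in log-log space. The function name `fit_power_law` and the numbers in the usage lines are hypothetical and synthetic; they are not taken from the paper, and real scaling-law fits typically also include an irreducible-loss offset that this sketch omits.

```python
import math


def fit_power_law(sizes, losses):
    """Fit loss = a * size**(-b) by ordinary least squares on log-log data.

    Returns the pair (a, b). Assumes all sizes and losses are positive.
    """
    xs = [math.log(s) for s in sizes]
    ys = [math.log(v) for v in losses]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope and intercept of the best-fit line y = intercept + slope * x.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # In log space the power law reads log(loss) = log(a) - b * log(size).
    return math.exp(intercept), -slope


if __name__ == "__main__":
    # Synthetic, illustrative data generated from a known power law.
    sizes = [1e3, 1e4, 1e5, 1e6]
    losses = [5.0 * s ** -0.3 for s in sizes]
    a, b = fit_power_law(sizes, losses)
    print(f"a = {a:.3f}, b = {b:.3f}")
```

The point of the sketch is the shape of the exercise: a two-parameter law summarizes an aggregate observable without reference to any individual weight.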

The methodological consequence is sharp. Learning mechanics favors average-case predictions over rigorous worst-case bounds. This makes it a distinct epistemic project from learning theory's PAC-style guarantees and from interpretability's per-circuit causal accounts. It is concerned with what happens during training, with dynamics rather than endpoints, and with phenomena that are robust across architecture and dataset choices.
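A toy illustration of an average-case prediction about training dynamics, under stated assumptions: gradient descent on a diagonal quadratic loss L(w) = 0.5 * sum_i c_i * w_i^2 with Gaussian initialization w_i(0) ~ N(0, sigma^2). Each coordinate then decays as w_i(t) = (1 - lr * c_i)^t * w_i(0), so the expected loss at every step has the closed form 0.5 * sigma^2 * sum_i c_i * (1 - lr * c_i)^(2t). The setting, the function names, and the parameter values are all hypothetical choices made for this sketch, not a model from the paper.

```python
import random


def simulate_gd(curvatures, w0, lr, steps):
    """Run gradient descent on L(w) = 0.5 * sum_i c_i * w_i**2.

    Returns the list of losses at steps 0, 1, ..., steps.
    """
    w = list(w0)
    losses = []
    for _ in range(steps + 1):
        losses.append(0.5 * sum(c * x * x for c, x in zip(curvatures, w)))
        # The gradient in coordinate i is c_i * w_i.
        w = [x - lr * c * x for c, x in zip(curvatures, w)]
    return losses


def predicted_mean_loss(curvatures, sigma, lr, t):
    """Closed-form expected loss at step t for w0 ~ N(0, sigma^2 I)."""
    return 0.5 * sigma ** 2 * sum(
        c * (1.0 - lr * c) ** (2 * t) for c in curvatures
    )


def empirical_mean_losses(curvatures, sigma, lr, steps, runs, seed=0):
    """Average the loss trajectory over many random initializations."""
    rng = random.Random(seed)
    totals = [0.0] * (steps + 1)
    for _ in range(runs):
        w0 = [rng.gauss(0.0, sigma) for _ in curvatures]
        for t, loss in enumerate(simulate_gd(curvatures, w0, lr, steps)):
            totals[t] += loss
    return [total / runs for total in totals]


if __name__ == "__main__":
    curvatures = [1.0, 0.5, 0.1]  # hypothetical loss curvatures
    empirical = empirical_mean_losses(curvatures, 1.0, 0.5, steps=20, runs=5000)
    for t in (0, 5, 10, 20):
        predicted = predicted_mean_loss(curvatures, 1.0, 0.5, t)
        print(f"t={t:2d}  predicted={predicted:.4f}  empirical={empirical[t]:.4f}")
```

The prediction covers the whole trajectory rather than only the endpoint, and it holds on average over initializations rather than as a bound on the worst one, which is the contrast the paragraph above draws.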

The paper anticipates a complementary relationship with mechanistic interpretability: "where mechanistic interpretability aims to be the biology of deep learning, learning mechanics should aspire to be its physics." Mechanistic interpretability dissects specific circuits in specific models; learning mechanics characterizes the dynamics any sufficiently large network exhibits during training. Both are necessary; neither is sufficient alone.

Original note title

learning mechanics is the emerging unifying frame for deep learning theory — concerned with training dynamics and average-case predictions not worst-case bounds