Can deep learning theory unify around training dynamics?
Is learning mechanics—focused on average-case predictions and training dynamics rather than worst-case bounds—the emerging framework that finally unifies fragmented deep learning theory?
Deep learning is the most powerful and most inscrutable member of the machine learning pantheon. Decades of attempts to put rigorous theoretical backing behind it have produced fragments (solvable toy models, scaling laws, hyperparameter limits, universal behaviors) but no unified frame. The argument in "There Will Be a Scientific Theory of Deep Learning" is that these fragments are not isolated; they are converging into a single discipline that the authors call learning mechanics.
Five strands point at the unification: (1) solvable idealized settings provide intuition for realistic systems, (2) tractable limits reveal fundamental phenomena, (3) simple mathematical laws capture macroscopic observables, (4) hyperparameter theories disentangle which parameters drive behavior, and (5) universal behaviors across systems clarify which phenomena need explanation. Each of these mirrors a move that classical, continuum, statistical, or quantum mechanics made for physical systems. The analogy is structural, not rhetorical: both fields develop libraries of solvable settings, both work with aggregate statistics rather than per-particle motion, both treat system parameters as first-class objects, and both encounter universality across regimes.
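As one illustration of what a "solvable idealized setting" can look like, consider plain gradient descent on linear regression, a standard textbook case that is not taken from the paper. With loss L(w) = ||Xw - y||^2 / (2n), the error along each eigenvector of the Hessian H = X^T X / n shrinks by a factor (1 - lr * lambda_i) per step, so the whole training trajectory can be written down in closed form. The sketch below checks that prediction against a simulated run; the function names, data sizes, and learning rate are all assumptions chosen for illustration.

```python
import numpy as np

def simulate_gd(X, y, w0, lr, steps):
    """Run plain gradient descent on L(w) = ||Xw - y||^2 / (2n); return all iterates."""
    n = X.shape[0]
    H = X.T @ X / n          # Hessian of the quadratic loss
    b = X.T @ y / n
    w = w0.copy()
    traj = [w.copy()]
    for _ in range(steps):
        w = w - lr * (H @ w - b)   # gradient of L is H w - b
        traj.append(w.copy())
    return np.array(traj)

def predict_gd(X, y, w0, lr, steps):
    """Closed-form iterates: each Hessian eigenmode decays as (1 - lr * lam)^t."""
    n = X.shape[0]
    H = X.T @ X / n
    b = X.T @ y / n
    lam, V = np.linalg.eigh(H)            # H = V diag(lam) V^T
    w_star = np.linalg.solve(H, b)        # minimizer (assumes H is full rank)
    e0 = V.T @ (w0 - w_star)              # initial error in the eigenbasis
    t = np.arange(steps + 1)[:, None]     # shape (steps + 1, 1)
    decay = (1.0 - lr * lam)[None, :] ** t
    return w_star + (decay * e0) @ V.T    # rotate each step's error back

# Hypothetical synthetic problem, used only to compare the two trajectories.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = X @ rng.normal(size=5) + 0.1 * rng.normal(size=200)
w0 = np.zeros(5)

sim = simulate_gd(X, y, w0, lr=0.1, steps=50)
pred = predict_gd(X, y, w0, lr=0.1, steps=50)
max_gap = np.max(np.abs(sim - pred))      # difference is at floating-point level
```

The point of the example is the shape of the claim, not the model: the theory predicts the full trajectory, step by step, rather than bounding only the final error.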
The methodological consequence is sharp. Learning mechanics prioritizes average-case predictions over rigorous worst-case bounds: it asks what a typical network does on typical data rather than what can be guaranteed in every case. This makes it a distinct epistemic project from learning theory's PAC-style guarantees and from interpretability's per-circuit causal accounts. It is concerned with what happens during training, with dynamics rather than endpoints, and with phenomena that are robust across architecture and dataset choices.
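The contrast between the two kinds of statement can be made concrete with a standard textbook pair, given here as an illustration and not drawn from the paper. The first line is the usual uniform-convergence guarantee for a finite hypothesis class with loss bounded in [0, 1]; the second is the kind of quantity an average-case theory would try to predict. All symbols (the hypothesis class, the risks R and R-hat, the initialization distribution p_0, the parameters theta_t after t steps) are notation assumed for this sketch.

```latex
% Worst-case (PAC-style): holds simultaneously for every hypothesis in a finite class H
\Pr_{S \sim \mathcal{D}^n}\!\left[\, \sup_{h \in \mathcal{H}} \bigl| R(h) - \hat{R}_S(h) \bigr|
  \le \sqrt{\frac{\ln|\mathcal{H}| + \ln(2/\delta)}{2n}} \,\right] \ge 1 - \delta

% Average-case: expected loss after t training steps, averaged over data and initialization
\bar{L}(t) = \mathbb{E}_{S \sim \mathcal{D}^n,\; \theta_0 \sim p_0}\bigl[\, L(\theta_t) \,\bigr]
```

The first statement says nothing about the training trajectory and must cover the worst hypothesis in the class; the second tracks a typical run as a function of training time.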
The paper anticipates a complementary relationship with mechanistic interpretability: "where mechanistic interpretability aims to be the biology of deep learning, learning mechanics should aspire to be its physics." Mechanistic interpretability dissects specific circuits in specific models; learning mechanics characterizes the dynamics any sufficiently large network exhibits during training. Both are necessary; neither is sufficient alone.
Related concepts in this collection
- Do language models understand in fundamentally different ways?
  Does mechanistic evidence reveal distinct tiers of understanding in LLMs, from concept recognition to factual knowledge to principled reasoning? And do these tiers coexist rather than replace each other?
  adjacent: how to read mechanistic evidence at multiple levels
- Can cognitive science methods unlock how LLMs actually work?
  Does Marr's three-level framework, developed to understand biological minds, offer interpretability researchers the structured methodology they need to decode opaque language models?
  adjacent framing: cognitive science methods for LLM interpretation; learning mechanics is the dynamics-of-training axis Marr's framework does not directly address
- Can humans understand deep learning before AI does?
  Explores whether investing in human-parseable deep learning theory remains valuable even if AI systems eventually develop their own self-understanding. Centers on why this matters for safety oversight.
  same paper, the safety argument that motivates pursuing the theory now
- Why do Shannon and Kolmogorov measures fail to value data?
  Shannon information and Kolmogorov complexity assume unlimited computational capacity. But do these classical measures actually capture what bounded learners can extract from real data?
  exemplifies: the compute-aware average-case turn applied to information theory
Original note title
learning mechanics is the emerging unifying frame for deep learning theory — concerned with training dynamics and average-case predictions not worst-case bounds