There Will Be a Scientific Theory of Deep Learning

Paper · arXiv 2604.21691

In this paper, we make the case that a scientific theory of deep learning is emerging. By this we mean a theory which characterizes important properties and statistics of the training process, hidden representations, final weights, and performance of neural networks. We pull together major strands of ongoing research in deep learning theory and identify five growing bodies of work that point toward such a theory: (1) solvable idealized settings that provide intuition for learning dynamics in realistic systems; (2) tractable limits that reveal insights into fundamental learning phenomena; (3) simple mathematical laws that capture important macroscopic observables; (4) theories of hyperparameters that disentangle them from the rest of the training process, leaving simpler systems behind; and (5) universal behaviors shared across systems and settings which clarify which phenomena call for explanation.

Deep learning is famously a black-box learning method, the most powerful, most inscrutable, and now most technologically important member of the machine learning pantheon. Properly trained, neural networks learn to perform a wide array of tasks with superhuman performance, but we have no unified scientific framework that explains why or how. Motivated by both scientific curiosity and the promise of practical engineering benefit, the effort to put rigorous mathematical and scientific backing behind this applied discipline has spanned decades. Despite some progress, however, our understanding remains primitive: neural networks are still trained using methods discovered largely through trial and error rather than first principles, and theory plays little role in the day-to-day practice of deep learning. This paper makes the case that, yes, there will be a scientific theory of deep learning; that we can see pieces of this theory starting to emerge; and that this theory will take the form of a mechanics of the learning process.

We argue that the emerging theory is best understood as a mechanics of the learning process, by analogy to classical, continuum, statistical, and quantum mechanics, and suggest the name learning mechanics. The emerging science shares deep similarities with established branches of mechanics. All branches of mechanics develop a library of analytically solvable settings to gain intuition; so too does learning mechanics. Continuum and statistical mechanics describe zoomed-out summary statistics rather than the motion of every particle; this has also proven a useful approach in dealing with the complexity of deep learning. Every physical system has one or more system parameters (characteristic scales, coupling constants, etc.) affecting its behavior, and some techniques for treating these are essentially the same as those used to study hyperparameters in deep learning. Finally, physics is full of cases in which the same phenomena show up in very different settings, and similarly we see universal behavior emerging across deep learning systems.
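As a toy illustration of what an analytically solvable setting can look like, the sketch below runs gradient descent on a one-parameter quadratic loss and compares the simulated trajectory with its exact closed-form solution. This example is not taken from the paper; the loss, the function names, and all numerical values are assumptions chosen only for illustration.

```python
def gd_trajectory(w0, w_star, curvature, lr, steps):
    """Run plain gradient descent on L(w) = 0.5 * curvature * (w - w_star)**2."""
    w = w0
    history = [w]
    for _ in range(steps):
        grad = curvature * (w - w_star)  # dL/dw for the quadratic loss
        w = w - lr * grad
        history.append(w)
    return history


def closed_form(w0, w_star, curvature, lr, t):
    """Exact solution of the same recursion: the error shrinks geometrically."""
    return w_star + (w0 - w_star) * (1.0 - lr * curvature) ** t


# Hypothetical values, chosen so that lr * curvature < 1 and training converges.
simulated = gd_trajectory(w0=0.0, w_star=2.0, curvature=3.0, lr=0.1, steps=20)
predicted = [closed_form(0.0, 2.0, 3.0, 0.1, t) for t in range(21)]

max_gap = max(abs(s - p) for s, p in zip(simulated, predicted))
print(f"largest gap between simulation and formula: {max_gap:.2e}")
```

In this deliberately simple case the formula predicts every step of training, so the learning rate and the curvature enter only through the product `lr * curvature`. Realistic networks are far from this regime, but solvable cases of this kind are the sort of intuition-building tool the analogy to mechanics refers to.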

These lines of research broadly share several overarching characteristics: they are concerned with the dynamics of the training process; they primarily seek to describe coarse aggregate statistics of learning; and they emphasize accurate average-case predictions over rigorous worst-case bounds. In this sense, the emerging scientific theory of neural networks appears to have much in common with theories in physics such as classical mechanics, continuum mechanics, statistical mechanics, and quantum mechanics. We discuss the relationship between this mechanics perspective and other approaches for building a theory of deep learning, including the statistical and information-theoretic perspectives. In particular, we anticipate a symbiotic and mutually supportive relationship between learning mechanics and the developing discipline of mechanistic interpretability. Where mechanistic interpretability aims to be the biology of deep learning, learning mechanics should aspire to be its physics, mirroring the complementary relationship between biology and physics in the natural sciences.

We also review and address common arguments that fundamental theory will not be possible or is not important. On the argument that AI will understand itself before we do: theory is already useful, and will become more impactful as it develops. It seems unlikely that AI working in isolation will suddenly and separately "solve deep learning theory." It seems more likely that breakthrough progress in a transitional period will come from human scientists using or working with AI, and that expert humans will remain in the loop. If one's goal is AI safety, some human oversight of AI systems will be necessary, and having a human-parseable theory of deep learning gives us a foot in the door.
