Parameter Symmetry Potentially Unifies Deep Learning Theory

Abstract

The dynamics of learning in modern large AI systems is hierarchical, often characterized by abrupt, qualitative shifts akin to phase transitions observed in physical systems. While these phenomena hold promise for uncovering the mechanisms behind neural networks and language models, existing theories remain fragmented, addressing specific cases. In this position paper, we advocate for the crucial role of the research direction of parameter symmetries in unifying these fragmented theories. This position is founded on a centralizing hypothesis for this direction: parameter symmetry breaking and restoration are the unifying mechanisms underlying the hierarchical learning behavior of AI models. We synthesize prior observations and theories to argue that this direction of research could lead to a unified understanding of three distinct hierarchies in neural networks: learning dynamics, model complexity, and representation formation. By connecting these hierarchies, our position paper elevates symmetry -- a cornerstone of theoretical physics -- to become a potential fundamental principle in modern AI.
