On the Banach spaces associated with multi-layer ReLU networks: Function representation, approximation theory and gradient descent dynamics

Abstract

We develop Banach spaces for ReLU neural networks of finite depth $L$ and infinite width. The spaces contain all finite fully connected $L$-layer networks and their $L^2$-limiting objects under bounds on the natural path-norm. Under this norm, the unit ball in the space for $L$-layer networks has low Rademacher complexity and thus favorable generalization properties. Functions in these spaces can be approximated by multi-layer neural networks with dimension-independent convergence rates. The key to this work is a new way of representing functions in some form of expectations, motivated by multi-layer neural networks. This representation allows us to define a new class of continuous models for machine learning. We show that the gradient flow defined this way is the natural continuous analog of the gradient descent dynamics for the associated multi-layer neural networks. We show that the path-norm increases at most polynomially under this continuous gradient flow dynamics.
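
The abstract does not spell out the path-norm, so the following is only an illustrative sketch. It assumes the common definition for a finite, bias-free, fully connected ReLU network: the sum, over all input-to-output paths, of the product of the absolute values of the weights along the path. The function names `path_norm` and `relu_network` and the example weights are hypothetical and are not taken from the paper.

```python
import numpy as np


def path_norm(weights):
    """Path-norm of a bias-free fully connected network (assumed definition).

    `weights` is a list [W_1, ..., W_L], where W_k has shape
    (width_k, width_{k-1}). The sum over all paths of the product of
    absolute weights equals 1^T |W_L| ... |W_1| 1.
    """
    v = np.ones(weights[0].shape[1])
    for W in weights:
        v = np.abs(W) @ v
    return float(v.sum())


def relu_network(weights, x):
    """Evaluate the network: ReLU after every layer except the last."""
    h = np.asarray(x, dtype=float)
    for W in weights[:-1]:
        h = np.maximum(W @ h, 0.0)
    return weights[-1] @ h


# Hypothetical two-layer example with two inputs and one output.
weights = [
    np.array([[1.0, -2.0], [0.0, 3.0]]),
    np.array([[2.0, -1.0]]),
]
x = np.array([1.0, -1.0])

norm = path_norm(weights)
out = relu_network(weights, x)
# For bias-free networks, |f(x)| <= path_norm * max_i |x_i|.
print(norm, out, abs(out[0]) <= norm * np.max(np.abs(x)))
```

Under this assumed definition, the path-norm controls the size of the network output on bounded inputs, which is one informal way to see why a bound on it can act as a capacity control.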
