A Spline Theory of Deep Networks (Extended Version)

Abstract

We build a rigorous bridge between deep networks (DNs) and approximation theory via spline functions and operators. Our key result is that a large class of DNs can be written as a composition of max-affine spline operators (MASOs), which provide a powerful portal through which to view and analyze their inner workings. For instance, conditioned on the input signal, the output of a MASO DN can be written as a simple affine transformation of the input. This implies that a DN constructs a set of signal-dependent, class-specific templates against which the signal is compared via a simple inner product; we explore the links to the classical theory of optimal classification via matched filters and the effects of data memorization. Going further, we propose a simple penalty term that can be added to the cost function of any DN learning algorithm to force the templates to be orthogonal to each other; this leads to significantly improved classification performance and reduced overfitting with no change to the DN architecture. The spline partition of the input signal space that is implicitly induced by a MASO directly links DNs to the theory of vector quantization (VQ) and K-means clustering, which opens up a new geometric avenue to study how DNs organize signals in a hierarchical fashion. To validate the utility of the VQ interpretation, we develop and validate a new distance metric for signals and images that quantifies the difference between their VQ encodings. (This paper is a significantly expanded version of a paper with the same title that will appear at ICML 2018.)
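The claim that a DN's output is an affine function of the input once the input is fixed can be checked numerically on a toy case. The following is a minimal sketch, not the paper's code: it assumes a small fully connected ReLU network with random weights, and the helper names (`relu_net_forward`, `affine_params`, `template_orthogonality_penalty`) are hypothetical. The penalty shown is one plausible form (sum of squared off-diagonal Gram entries of the templates) and may differ from the exact term the paper proposes.

```python
import numpy as np


def relu_net_forward(x, weights, biases):
    """Forward pass of a fully connected ReLU network with a linear last layer.

    Returns the output and the 0/1 activation pattern of each hidden layer.
    """
    h = x
    patterns = []
    for W, b in zip(weights[:-1], biases[:-1]):
        pre = W @ h + b
        q = (pre > 0).astype(pre.dtype)  # which affine piece each unit selects
        patterns.append(q)
        h = q * pre                      # identical to np.maximum(pre, 0)
    out = weights[-1] @ h + biases[-1]
    return out, patterns


def affine_params(weights, biases, patterns):
    """Collapse the network into out = A @ x + c for a fixed activation pattern."""
    d_in = weights[0].shape[1]
    A = np.eye(d_in)
    c = np.zeros(d_in)
    for W, b, q in zip(weights[:-1], biases[:-1], patterns):
        A = q[:, None] * (W @ A)         # zero out rows of inactive units
        c = q * (W @ c + b)
    A = weights[-1] @ A
    c = weights[-1] @ c + biases[-1]
    return A, c


def template_orthogonality_penalty(A):
    """Assumed penalty form: squared inner products between distinct templates."""
    G = A @ A.T
    off_diag = G - np.diag(np.diag(G))
    return float(np.sum(off_diag ** 2))


# Toy network: 8 inputs, two hidden layers of 16 units, 3 output classes.
rng = np.random.default_rng(0)
sizes = [8, 16, 16, 3]
weights = [rng.standard_normal((m, n)) for n, m in zip(sizes[:-1], sizes[1:])]
biases = [rng.standard_normal(m) for m in sizes[1:]]

x = rng.standard_normal(sizes[0])
out, patterns = relu_net_forward(x, weights, biases)
A, c = affine_params(weights, biases, patterns)

# Each row of A acts as a signal-dependent template for one class:
# the class score is the inner product of that row with x, plus an offset.
assert np.allclose(out, A @ x + c)
penalty = template_orthogonality_penalty(A)
```

Here `A` and `c` depend on `x` only through the activation pattern, so they stay constant over the region of input space that shares that pattern. This is the sense in which the network is affine conditioned on the input.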
