On the Expressive Power of Deep Diagonal Circulant Neural Networks

Abstract

In this paper, we study deep diagonal circulant neural networks, that is, deep neural networks in which all weight matrices are the product of diagonal and circulant ones. We show that these networks outperform the recently introduced deep networks with other types of structured layers. Besides introducing principled techniques for training these models, we provide theoretical guarantees regarding their expressivity. Indeed, we prove that the function space spanned by diagonal circulant networks of bounded depth includes the one spanned by dense networks with specific properties on their rank. We conduct a thorough experimental study to compare the performance of deep diagonal circulant networks with state-of-the-art models based on structured matrices and with dense models. We show that our models achieve better accuracy than their structured alternatives while requiring 2x fewer weights than the next best approach. Finally, we train deep diagonal circulant networks to build compact and accurate models on a real-world video classification dataset with over 3.8 million training examples.
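To make the structure concrete, the following is a minimal Python sketch of a network whose weight matrices are products of a diagonal and a circulant matrix. It is an illustration under stated assumptions, not the paper's implementation: the function names, the ordering diag(d) · circ(c), the first-column convention for the circulant matrix, the ReLU activation, and the omission of bias terms are all choices made here for the example.

```python
def circulant_matvec(c, x):
    """Multiply circ(c) by x, where circ(c)[i][j] = c[(i - j) % n].

    Assumption: c is the first column of the circulant matrix.
    """
    n = len(x)
    return [sum(c[(i - j) % n] * x[j] for j in range(n)) for i in range(n)]


def dc_layer(d, c, x):
    """Apply one diagonal-circulant weight matrix: diag(d) @ circ(c) @ x."""
    return [d_i * y_i for d_i, y_i in zip(d, circulant_matvec(c, x))]


def relu(v):
    """Elementwise ReLU (assumed activation, not taken from the abstract)."""
    return [max(0.0, t) for t in v]


def dc_network(layers, x):
    """Run x through a stack of layers, each given as a (d, c) pair."""
    for d, c in layers:
        x = relu(dc_layer(d, c, x))
    return x


def parameter_counts(n, depth):
    """Return (diagonal-circulant weights, dense weights) for width n."""
    return 2 * n * depth, n * n * depth
```

Each layer stores only the two length-n vectors d and c, so under these assumptions a network of width 1024 and depth 3 holds 6,144 weights where a dense network of the same shape would hold 3,145,728. The explicit double loop is written for readability; it costs O(n^2) per layer and is not meant as an efficient implementation.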
