Multigrid-in-Channels Architectures for Wide Convolutional Neural Networks

Abstract

We present a multigrid approach that combats the quadratic growth of the number of parameters with respect to the number of channels in standard convolutional neural networks (CNNs). It has been shown that there is a redundancy in standard CNNs, as networks with much sparser convolution operators can yield similar performance to full networks. The sparsity patterns that lead to such behavior, however, are typically random, hampering hardware efficiency. In this work, we present a multigrid-in-channels approach for building CNN architectures that achieves full coupling of the channels, and whose number of parameters is linearly proportional to the width of the network. To this end, we replace each convolution layer in a generic CNN with a multilevel layer consisting of structured (i.e., grouped) convolutions. Our examples from supervised image classification show that applying this strategy to residual networks and MobileNetV2 considerably reduces the number of parameters without negatively affecting accuracy. Therefore, we can widen networks without dramatically increasing the number of parameters or operations.
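
The scaling claim can be illustrated with a small parameter count. The sketch below is not the multilevel layer described in the abstract; it only compares a single standard convolution with a single grouped convolution of fixed group size. The function names, the 3x3 kernel, and the group size of 16 are assumptions chosen for illustration, and bias terms are ignored.

```python
def full_conv_params(channels: int, kernel: int = 3) -> int:
    """Weights in a standard convolution with `channels` inputs and outputs."""
    return kernel * kernel * channels * channels


def grouped_conv_params(channels: int, group_size: int, kernel: int = 3) -> int:
    """Weights in a grouped convolution whose groups each hold `group_size` channels."""
    if channels % group_size != 0:
        raise ValueError("channels must be divisible by group_size")
    groups = channels // group_size
    # Each group maps group_size input channels to group_size output channels.
    return groups * kernel * kernel * group_size * group_size


if __name__ == "__main__":
    # Hypothetical widths; the group size of 16 is an illustrative assumption.
    for c in (64, 128, 256, 512):
        print(c, full_conv_params(c), grouped_conv_params(c, group_size=16))
```

Doubling the width quadruples the weight count of the standard convolution but only doubles that of the grouped one. A single grouped convolution does not couple channels across groups, which is the gap the multilevel layer is meant to close.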
