Recurrent Parameter Generators

  • 2021-07-15 04:23:59
  • Jiayun Wang, Yubei Chen, Stella X. Yu, Brian Cheung, Yann LeCun

Abstract

We present a generic method for recurrently using the same parameters for many different convolution layers to build a deep network. Specifically, for a network, we create a recurrent parameter generator (RPG), from which the parameters of each convolution layer are generated. Though using recurrent models to build a deep convolutional neural network (CNN) is not entirely new, our method achieves significant performance gain compared to the existing works. We demonstrate how to build a one-layer neural network that achieves performance similar to other traditional CNN models on various applications and datasets. Such a method allows us to build an arbitrarily complex neural network with any number of parameters. For example, we build a ResNet34 with model parameters reduced by more than $400$ times, which still achieves $41.6\%$ ImageNet top-1 accuracy. Furthermore, we demonstrate the RPG can be applied at different scales, such as layers, blocks, or even sub-networks. Specifically, we use the RPG to build a ResNet18 network with the number of weights equivalent to one convolutional layer of a conventional ResNet and show this model can achieve $67.2\%$ ImageNet top-1 accuracy. The proposed method can be viewed as an inverse approach to model compression. Rather than removing the unused parameters from a large model, it aims to squeeze more information into a small number of parameters. Extensive experimental results are provided to demonstrate the power of the proposed recurrent parameter generator.
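The core idea, generating every layer's weights from one shared pool, can be illustrated with a minimal sketch. The abstract does not describe how the generator is implemented, so everything here is an assumption for illustration: the class name `RecurrentParameterGenerator`, the cyclic (wrap-around) read from a flat parameter bank, the bank size, and the layer shapes are all hypothetical and not taken from the paper.

```python
import random


class RecurrentParameterGenerator:
    """Hypothetical shared parameter bank that is read cyclically.

    Assumption: layers draw consecutive values from one flat bank and
    wrap around when the end is reached. The paper may use a different
    generation rule.
    """

    def __init__(self, size, seed=0):
        rng = random.Random(seed)
        # The only stored (trainable) values in this sketch.
        self.bank = [rng.gauss(0.0, 0.02) for _ in range(size)]
        self.cursor = 0

    def generate(self, shape):
        """Return a flat list of weights for a layer of the given shape."""
        count = 1
        for dim in shape:
            count *= dim
        size = len(self.bank)
        flat = [self.bank[(self.cursor + k) % size] for k in range(count)]
        # Advance so the next layer starts where this one stopped.
        self.cursor = (self.cursor + count) % size
        return flat


# Illustrative convolution weight shapes: (out_channels, in_channels, kH, kW).
layer_shapes = [(16, 3, 3, 3), (32, 16, 3, 3), (64, 32, 3, 3)]

rpg = RecurrentParameterGenerator(size=1000)
layer_weights = [rpg.generate(shape) for shape in layer_shapes]

conventional = sum(len(w) for w in layer_weights)  # weights the layers consume
stored = len(rpg.bank)                             # weights actually stored
print(f"layer weights: {conventional}, stored parameters: {stored}")
```

In this sketch the layers together consume far more weights than the bank stores, which is the sense in which the same parameters are reused across layers. Reshaping the flat lists into convolution kernels and training the bank by backpropagation are left out.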

 
