Abstract
Hypernetworks are neural networks that transform a random input vector into weights for a specified target neural network. We formulate the hypernetwork training objective as a compromise between accuracy and diversity, where the diversity takes into account trivial symmetry transformations of the target network. We show that this formulation naturally arises as a relaxation of an optimistic probability distribution objective for the generated networks, and we explain how it is related to variational inference. We use multi-layered perceptrons to form the mapping from the low-dimensional input random vector to the high-dimensional weight space, and demonstrate how to reduce the number of parameters in this mapping by weight sharing. We perform experiments on a four-layer convolutional target network which classifies MNIST images, and show that the generated weights are diverse and have interesting distributions.
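
As a rough illustration of the mapping described above, the following minimal sketch shows a multi-layered perceptron that turns a random input vector into a flat weight vector, which is then unpacked and used by a target network. It is not the implementation used in this work: all sizes, the names `generate_weights`, `unpack`, and `target_forward`, and the small dense target network (used in place of the four-layer convolutional one for brevity) are assumptions made for the example, and neither training nor weight sharing is shown.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration (not taken from the paper).
Z_DIM, HYPER_HIDDEN = 8, 32
IN_DIM, TARGET_HIDDEN, OUT_DIM = 4, 5, 3

# Shapes of the target network's parameters: two dense layers with biases.
TARGET_SHAPES = [(IN_DIM, TARGET_HIDDEN), (TARGET_HIDDEN,),
                 (TARGET_HIDDEN, OUT_DIM), (OUT_DIM,)]
N_TARGET = sum(int(np.prod(s)) for s in TARGET_SHAPES)

# Hypernetwork parameters: a two-layer MLP from z to the flat weight vector.
hyper = {
    "W1": rng.normal(0.0, 0.1, (Z_DIM, HYPER_HIDDEN)),
    "b1": np.zeros(HYPER_HIDDEN),
    "W2": rng.normal(0.0, 0.1, (HYPER_HIDDEN, N_TARGET)),
    "b2": np.zeros(N_TARGET),
}

def generate_weights(z, hyper):
    """Map a random input vector z to a flat vector of target-network weights."""
    h = np.maximum(z @ hyper["W1"] + hyper["b1"], 0.0)  # ReLU hidden layer
    return h @ hyper["W2"] + hyper["b2"]

def unpack(flat):
    """Split the flat weight vector into arrays with the target shapes."""
    params, start = [], 0
    for shape in TARGET_SHAPES:
        size = int(np.prod(shape))
        params.append(flat[start:start + size].reshape(shape))
        start += size
    return params

def target_forward(x, flat):
    """Run the (assumed) dense target network with the generated weights."""
    W1, b1, W2, b2 = unpack(flat)
    h = np.maximum(x @ W1 + b1, 0.0)
    return h @ W2 + b2

# Two random input vectors yield two different target networks.
z_a, z_b = rng.normal(size=Z_DIM), rng.normal(size=Z_DIM)
w_a, w_b = generate_weights(z_a, hyper), generate_weights(z_b, hyper)

x = rng.normal(size=(2, IN_DIM))  # a batch of two dummy inputs
logits_a = target_forward(x, w_a)
logits_b = target_forward(x, w_b)
```

In this sketch only the entries of `hyper` would be trained; each fresh draw of the input vector then produces a new set of target weights without further optimization.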