The Deep Weight Prior - a Prior Distribution for CNNs via Generative Modeling of Parameters of the Model

Abstract

Bayesian inference is known to provide a general framework for incorporating prior knowledge or specific properties into machine learning models via carefully choosing a prior distribution. In this work, we propose a new type of prior distribution for convolutional neural networks, the deep weight prior, that, in contrast to previously published techniques, favors empirically estimated structure of convolutional filters, e.g., spatial correlations of weights. We define the deep weight prior as an implicit distribution and propose a method for variational inference with this type of implicit prior. In experiments, we show that deep weight priors can improve the performance of Bayesian neural networks on several problems when training data is limited. Also, we found that initializing the weights of conventional networks with samples from the deep weight prior leads to faster training.
