Understanding training and generalization in deep learning by Fourier analysis

Abstract

Background: It remains an open problem to explain theoretically why deep neural networks (DNNs), which have many more parameters than training data and are trained by (stochastic) gradient-based methods, often achieve remarkably low generalization error. Contribution: We study DNN training by Fourier analysis. Our theoretical framework explains that: i) a DNN trained with (stochastic) gradient-based methods gives low-frequency components of the target function higher priority during training; ii) small initialization leads to good generalization ability of the DNN while preserving its ability to fit any function. These results are further confirmed by experiments in which DNNs fit natural images, one-dimensional functions, and the MNIST dataset.
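
Claim i) can be probed with a small numerical experiment. The sketch below is not the paper's experimental setup. It assumes a one-hidden-layer tanh network, the toy target sin(x) + sin(5x) on a uniform grid, full-batch gradient descent, and hypothetical hyperparameters (`width`, `lr`, `init_scale`) chosen only for illustration. It records the relative error of the network output at the two target frequencies in the discrete Fourier domain, so the two error curves can be compared as training proceeds.

```python
import numpy as np

LOW_K, HIGH_K = 1, 5  # DFT indices of sin(x) and sin(5x) on the grid below


def train_and_track(steps=300, width=64, lr=0.01, init_scale=0.1, seed=0):
    """Train a tiny tanh network by full-batch gradient descent.

    Returns an array of shape (steps, 3) whose columns are: loss, relative
    error at the low frequency, relative error at the high frequency.
    All hyperparameters are illustrative assumptions, not the paper's.
    """
    rng = np.random.default_rng(seed)
    n = 64
    # Uniform periodic grid, so each sine falls on exactly one DFT bin.
    x = np.linspace(-np.pi, np.pi, n, endpoint=False).reshape(-1, 1)
    y = np.sin(x) + np.sin(5 * x)
    target_hat = np.fft.rfft(y[:, 0])

    # Small random initialization (the regime discussed in claim ii).
    W1 = init_scale * rng.standard_normal((1, width))
    b1 = init_scale * rng.standard_normal(width)
    W2 = init_scale * rng.standard_normal((width, 1))
    b2 = np.zeros(1)

    history = np.zeros((steps, 3))
    for step in range(steps):
        # Forward pass.
        h = np.tanh(x @ W1 + b1)
        pred = h @ W2 + b2
        err = pred - y
        loss = 0.5 * np.mean(err ** 2)

        # Relative error of the prediction at each target frequency.
        pred_hat = np.fft.rfft(pred[:, 0])
        rel = np.abs(pred_hat - target_hat)
        rel_low = rel[LOW_K] / np.abs(target_hat[LOW_K])
        rel_high = rel[HIGH_K] / np.abs(target_hat[HIGH_K])
        history[step] = (loss, rel_low, rel_high)

        # Backward pass for loss = 0.5 * mean(err ** 2).
        g_pred = err / n
        gW2 = h.T @ g_pred
        gb2 = g_pred.sum(axis=0)
        g_z = (g_pred @ W2.T) * (1.0 - h ** 2)
        gW1 = x.T @ g_z
        gb1 = g_z.sum(axis=0)

        # Plain gradient descent update.
        W1 -= lr * gW1
        b1 -= lr * gb1
        W2 -= lr * gW2
        b2 -= lr * gb2
    return history


if __name__ == "__main__":
    hist = train_and_track()
    for step in (0, 100, 299):
        loss, low, high = hist[step]
        print(f"step {step}: loss={loss:.4f} low={low:.3f} high={high:.3f}")
```

If the low-frequency priority holds in this toy setting, the low-frequency error column should fall well before the high-frequency one. Varying `init_scale` gives a rough way to explore claim ii) in the same setup.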
