Solving internal covariate shift in deep learning with linked neurons

Abstract

This work proposes a novel solution to the problem of internal covariate shift and dying neurons using the concept of linked neurons. We define the neuron linkage in terms of two constraints. First, all neuron activations in the linkage must have the same operating point; that is to say, all of them share input weights. Second, a set of neurons is linked if and only if there is at least one member of the linkage that has a non-zero gradient with respect to the input of the activation function. This means that for any input to the activation function, there is at least one member of the linkage that operates in a non-flat and non-zero area. This simple change has profound implications for the learning dynamics of the network. In this article we explore the consequences of this proposal and show that, by using this kind of unit, internal covariate shift is implicitly solved. As a result, linked neurons make it possible to train arbitrarily large networks without any architectural or algorithmic trick, effectively removing the need for re-normalization schemes such as Batch Normalization, which in turn halves the required training time. It also removes the need for standardized input data. Results show that units using the linkage not only solve the aforementioned problems effectively, but are also a competitive alternative to the state of the art, with very promising results.
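
The two constraints can be illustrated with a minimal sketch. It assumes one possible linkage that the abstract does not itself specify: a pair of ReLU units that share the same weights and bias, one applied to the pre-activation z and the other to -z. The function name `linked_relu` and the example weights are hypothetical and chosen only for illustration. For any z other than exactly zero, one of the two units has a non-zero gradient with respect to z, so the pair as a whole never "dies".

```python
def relu(z):
    """Standard rectified linear unit."""
    return max(0.0, z)


def linked_relu(x, w, b):
    """Hypothetical linked pair (an assumption, not the paper's confirmed definition).

    Both units share the input weights w and bias b (same operating point).
    Unit 1 computes relu(z); unit 2 computes relu(-z).
    Returns the shared pre-activation, both outputs, and both gradients w.r.t. z.
    """
    # Shared pre-activation: constraint 1 (same input weights for every member).
    z = sum(wi * xi for wi, xi in zip(w, x)) + b

    outputs = (relu(z), relu(-z))

    # d relu(z)/dz is 1 for z > 0; d relu(-z)/dz is -1 for z < 0.
    # Constraint 2: for z != 0, exactly one of these is non-zero.
    grads = (1.0 if z > 0 else 0.0, -1.0 if z < 0 else 0.0)
    return z, outputs, grads


if __name__ == "__main__":
    w, b = [0.5, -1.0], 0.1  # illustrative values only
    for x in ([2.0, 0.0], [0.0, 3.0]):
        z, outputs, grads = linked_relu(x, w, b)
        print(f"z={z:+.2f} outputs={outputs} grads={grads}")
```

With a plain ReLU, the second input (negative pre-activation) would yield zero output and zero gradient; in the linked pair, the second unit carries the signal and the gradient instead. Whether the paper's own linkage matches this pairing should be checked against the full text.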
