Reinitializing weights vs units for maintaining plasticity in neural networks

Abstract

Loss of plasticity is a phenomenon in which a neural network loses its ability to learn when trained for an extended time on non-stationary data. It is a crucial problem to overcome when designing systems that learn continually. An effective technique for preventing loss of plasticity is reinitializing parts of the network. In this paper, we compare two different reinitialization schemes: reinitializing units vs reinitializing weights. We propose a new algorithm, which we name selective weight reinitialization, for reinitializing the least useful weights in a network. We compare our algorithm to continual backpropagation and ReDo, two previously proposed algorithms that reinitialize units in the network. Through our experiments in continual supervised learning problems, we identify two settings in which reinitializing weights is more effective at maintaining plasticity than reinitializing units: (1) when the network has a small number of units and (2) when the network includes layer normalization. Conversely, reinitializing weights and reinitializing units are equally effective at maintaining plasticity when the network is of sufficient size and does not include layer normalization. We found that reinitializing weights maintains plasticity in a wider variety of settings than reinitializing units.
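To make the idea of reinitializing the least useful weights concrete, the following is a minimal sketch and not the algorithm as specified in the paper. The abstract does not say how usefulness is measured or how replacement values are drawn, so two assumptions are made here purely for illustration: utility is taken to be the absolute value of each weight, and reinitialized weights are resampled from a zero-mean Gaussian. The function name `reinit_least_useful_weights` and the parameters `fraction` and `init_scale` are hypothetical.

```python
import numpy as np

def reinit_least_useful_weights(weights, fraction, rng, init_scale=0.1):
    """Return (new_weights, mask) with the lowest-utility weights resampled.

    ASSUMPTION: utility is the absolute weight value, and new values are
    drawn from N(0, init_scale^2). Both are illustrative placeholders.
    """
    new_weights = weights.copy()
    k = int(fraction * weights.size)  # number of weights to reinitialize
    if k == 0:
        return new_weights, np.zeros(weights.shape, dtype=bool)

    utility = np.abs(weights).ravel()
    # Indices of the k smallest-utility weights (their relative order is irrelevant).
    idx = np.argpartition(utility, k - 1)[:k]

    mask = np.zeros(weights.size, dtype=bool)
    mask[idx] = True
    mask = mask.reshape(weights.shape)

    # Resample only the selected weights; all other weights are left untouched.
    new_weights[mask] = rng.normal(0.0, init_scale, size=k)
    return new_weights, mask

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 3))  # toy weight matrix for a single layer
new_w, mask = reinit_least_useful_weights(w, fraction=0.25, rng=rng)
```

Unlike unit-level schemes, which reset every incoming weight of a selected unit at once, this sketch selects individual entries of the weight matrix, so a unit can keep most of its weights while a few are replaced.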
