Abstract
The reinforcement learning (RL) problem is rife with sources of non-stationarity, making it a notoriously difficult problem domain for the application of neural networks. We identify a mechanism by which non-stationary prediction targets can prevent learning progress in deep RL agents: \textit{capacity loss}, whereby networks trained on a sequence of target values lose their ability to quickly update their predictions over time. We demonstrate that capacity loss occurs in a range of RL agents and environments, and is particularly damaging to performance in sparse-reward tasks. We then present a simple regularizer, Initial Feature Regularization (InFeR), that mitigates this phenomenon by regressing a subspace of features towards its value at initialization, leading to significant performance improvements in sparse-reward environments such as Montezuma's Revenge. We conclude that preventing capacity loss is crucial to enable agents to maximally benefit from the learning signals they obtain throughout the entire training trajectory.
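
To make the idea of regressing a subspace of features towards its value at initialization concrete, the following is a minimal sketch and not the implementation used in this work. It assumes the subspace is defined by a fixed linear projection of the feature vector and that the penalty is a mean squared error over a batch. The names `project`, `infer_penalty`, `weights`, and the toy dimensions are all hypothetical choices made for illustration.

```python
import random


def project(weights, features):
    """Apply a fixed linear map (k x d) to a d-dimensional feature vector."""
    return [sum(w * f for w, f in zip(row, features)) for row in weights]


def infer_penalty(current_feats, initial_feats, weights):
    """Batch mean of the squared distance between projected features.

    current_feats: features from the network being trained (batch x d)
    initial_feats: features from a frozen copy taken at initialization
    weights:       assumed fixed projection defining the regularized subspace
    """
    total = 0.0
    for cur, init in zip(current_feats, initial_feats):
        p_cur = project(weights, cur)
        p_init = project(weights, init)
        total += sum((a - b) ** 2 for a, b in zip(p_cur, p_init))
    return total / len(current_feats)


# Toy data standing in for real network features (assumption: d=8, k=3).
rng = random.Random(0)
d, k, batch = 8, 3, 4
weights = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(k)]
initial_feats = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(batch)]

# Simulate features drifting away from initialization during training.
drifted_feats = [[f + 0.5 for f in row] for row in initial_feats]

penalty_at_init = infer_penalty(initial_feats, initial_feats, weights)
penalty_after_drift = infer_penalty(drifted_feats, initial_feats, weights)
```

In an agent, a penalty of this kind would presumably be added to the usual RL objective with a weighting coefficient, so that the projected features stay close to their initial values while the remaining capacity fits the changing targets.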