Understanding and Preventing Capacity Loss in Reinforcement Learning

Abstract

The reinforcement learning (RL) problem is rife with sources of non-stationarity, making it a notoriously difficult problem domain for the application of neural networks. We identify a mechanism by which non-stationary prediction targets can prevent learning progress in deep RL agents: capacity loss, whereby networks trained on a sequence of target values lose their ability to quickly update their predictions over time. We demonstrate that capacity loss occurs in a range of RL agents and environments, and is particularly damaging to performance in sparse-reward tasks. We then present a simple regularizer, Initial Feature Regularization (InFeR), that mitigates this phenomenon by regressing a subspace of features towards its value at initialization, leading to significant performance improvements in sparse-reward environments such as Montezuma's Revenge. We conclude that preventing capacity loss is crucial to enable agents to maximally benefit from the learning signals they obtain throughout the entire training trajectory.
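
As a rough illustration of what "regressing a subspace of features towards its value at initialization" could look like, the sketch below computes a penalty of this kind with NumPy. It is one possible reading of the abstract, not the paper's implementation: the one-layer network, the linear auxiliary heads `H`, the layer sizes, and the names `features`, `aux_outputs`, and `infer_penalty` are all assumptions made for illustration.

```python
import numpy as np

def features(params, x):
    """One hidden ReLU layer standing in for the agent's feature network (assumed)."""
    return np.maximum(0.0, x @ params["W"] + params["b"])

def aux_outputs(params, x):
    """Linear auxiliary heads: a k-dimensional projection of the features (assumed)."""
    return features(params, x) @ params["H"]

def infer_penalty(params, init_params, x):
    """Mean squared distance between current and initial auxiliary outputs."""
    diff = aux_outputs(params, x) - aux_outputs(init_params, x)
    return float(np.mean(diff ** 2))

rng = np.random.default_rng(0)

# Hypothetical sizes: 4 input dims, 16 features, 3 auxiliary heads.
init_params = {
    "W": rng.normal(size=(4, 16)),
    "b": np.zeros(16),
    "H": rng.normal(size=(16, 3)),
}
# The trainable copy starts identical to the frozen snapshot.
params = {k: v.copy() for k, v in init_params.items()}
x = rng.normal(size=(8, 4))  # a batch of 8 made-up observations

penalty_at_init = infer_penalty(params, init_params, x)

# Simulate feature drift caused by training on changing targets.
params["W"] += 0.5 * rng.normal(size=params["W"].shape)
penalty_after_drift = infer_penalty(params, init_params, x)
```

In a training loop, a term like this would presumably be added to the agent's usual loss with a weighting coefficient, so that the features stay close to their initial values along the chosen projections while the rest of the network remains free to fit the targets.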
