Improving Performance in Reinforcement Learning by Breaking Generalization in Neural Networks

Abstract

Reinforcement learning systems require good representations to work well. For decades practical success in reinforcement learning was limited to small domains. Deep reinforcement learning systems, on the other hand, are scalable, not dependent on domain-specific prior knowledge, and have been successfully used to play Atari, in 3D navigation from pixels, and to control high degree of freedom robots. Unfortunately, the performance of deep reinforcement learning systems is sensitive to hyper-parameter settings and architecture choices. Even well-tuned systems exhibit significant instability both within a trial and across experiment replications. In practice, significant expertise and trial and error are usually required to achieve good performance. One potential source of the problem is known as catastrophic interference: when later training decreases performance by overriding previous learning. Interestingly, the powerful generalization that makes Neural Networks (NN) so effective in batch supervised learning might explain the challenges when applying them in reinforcement learning tasks. In this paper, we explore how online NN training and interference interact in reinforcement learning. We find that simply re-mapping the input observations to a high-dimensional space improves learning speed and parameter sensitivity. We also show this preprocessing reduces interference in prediction tasks. More practically, we provide a simple approach to NN training that is easy to implement and requires little additional computation. We demonstrate that our approach improves performance in both prediction and control with an extensive batch of experiments in classic control domains.
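The abstract does not say which re-mapping is used, so the following is only an illustrative sketch of the general idea: a low-dimensional observation is expanded into a much larger, sparse binary vector before it reaches the network. The function name `bin_encode`, the equal-width binning scheme, and the parameter `bins_per_dim` are assumptions made for this example and are not taken from the paper.

```python
import numpy as np


def bin_encode(obs, low, high, bins_per_dim=16):
    """Map a bounded 1-D observation to a sparse binary feature vector.

    Hypothetical helper: each input dimension is split into `bins_per_dim`
    equal-width bins and one-hot encoded, and the per-dimension codes are
    concatenated. The output has len(obs) * bins_per_dim entries.
    """
    obs = np.asarray(obs, dtype=np.float64)
    low = np.asarray(low, dtype=np.float64)
    high = np.asarray(high, dtype=np.float64)

    # Scale each dimension to [0, 1], clipping values outside the bounds.
    scaled = np.clip((obs - low) / (high - low), 0.0, 1.0)

    # Bin index per dimension; a scaled value of exactly 1.0 goes in the last bin.
    idx = np.minimum((scaled * bins_per_dim).astype(int), bins_per_dim - 1)

    # One active unit per input dimension, all other units zero.
    features = np.zeros(obs.size * bins_per_dim)
    features[np.arange(obs.size) * bins_per_dim + idx] = 1.0
    return features


# Example: a 2-D observation becomes a 32-D input with two active units.
x = bin_encode([-0.6, 0.0], low=[-1.2, -0.07], high=[0.6, 0.07])
```

Under this kind of encoding, observations that fall in different bins activate disjoint input units, so a gradient update for one observation changes fewer of the first-layer weights used by distant observations. That is one plausible reading of how a high-dimensional re-mapping could limit interference; the paper's own construction may differ.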
