Learning Representations in Reinforcement Learning: An Information Bottleneck Approach

Abstract

The information bottleneck principle is an elegant and useful approach to representation learning. In this paper, we investigate the problem of representation learning in the context of reinforcement learning using the information bottleneck framework, aiming at improving the sample efficiency of the learning algorithms. We analytically derive the optimal conditional distribution of the representation and provide a variational lower bound. Then, we maximize this lower bound with the Stein variational (SV) gradient method. We incorporate this framework in the advantage actor-critic algorithm (A2C) and the proximal policy optimization algorithm (PPO). Our experimental results show that our framework can improve the sample efficiency of vanilla A2C and PPO significantly. Finally, we study the information bottleneck (IB) perspective in deep RL with the algorithm called mutual information neural estimation (MINE). We experimentally verify that the information extraction-compression process also exists in deep RL and that our framework is capable of accelerating this process. We also analyze the relationship between MINE and our method; through this relationship, we theoretically derive an algorithm to optimize our IB framework without constructing the lower bound.
