Learning Invariant Representations for Reinforcement Learning without Reconstruction

Abstract

We study how representation learning can accelerate reinforcement learning from rich observations, such as images, without relying either on domain knowledge or pixel-reconstruction. Our goal is to learn representations that both provide for effective downstream control and invariance to task-irrelevant details. Bisimulation metrics quantify behavioral similarity between states in continuous MDPs, which we propose using to learn robust latent representations which encode only the task-relevant information from observations. Our method trains encoders such that distances in latent space equal bisimulation distances in state space. We demonstrate the effectiveness of our method at disregarding task-irrelevant information using modified visual MuJoCo tasks, where the background is replaced with moving distractors and natural videos, while achieving SOTA performance. We also test a first-person highway driving task where our method learns invariance to clouds, weather, and time of day. Finally, we provide generalization results drawn from properties of bisimulation metrics, and links to causal inference.
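To make the training objective above concrete, the following is a minimal sketch of a bisimulation-style loss for one pair of states. It is an illustration under stated assumptions rather than the implementation used in this work: it assumes the bisimulation distance is approximated by a reward difference plus a discounted 2-Wasserstein distance between predicted next-latent distributions, that those distributions are diagonal Gaussians, and that latent distance is measured with the L1 norm. The function names and the discount `gamma` are hypothetical.

```python
import math


def w2_diag_gaussian(mu_a, sigma_a, mu_b, sigma_b):
    """Closed-form 2-Wasserstein distance between two diagonal Gaussians.

    mu_* are mean vectors and sigma_* are per-dimension standard deviations.
    """
    sq = sum((m1 - m2) ** 2 for m1, m2 in zip(mu_a, mu_b))
    sq += sum((s1 - s2) ** 2 for s1, s2 in zip(sigma_a, sigma_b))
    return math.sqrt(sq)


def bisim_target(r_a, r_b, next_a, next_b, gamma=0.99):
    """Approximate bisimulation distance between two states (assumed form).

    next_a and next_b are (mean, std) pairs describing the predicted
    next-latent distribution for each state.
    """
    mu_a, sigma_a = next_a
    mu_b, sigma_b = next_b
    reward_gap = abs(r_a - r_b)
    transition_gap = w2_diag_gaussian(mu_a, sigma_a, mu_b, sigma_b)
    return reward_gap + gamma * transition_gap


def bisim_loss(z_a, z_b, target):
    """Squared error between the L1 latent distance and the target distance."""
    latent_dist = sum(abs(x - y) for x, y in zip(z_a, z_b))
    return (latent_dist - target) ** 2


# Example pair: latents, rewards, and predicted next-latent Gaussians.
target = bisim_target(
    1.0, 0.5,
    ([0.0, 0.0], [1.0, 1.0]),
    ([3.0, 4.0], [1.0, 1.0]),
    gamma=0.5,
)
loss = bisim_loss([0.0, 0.0], [1.0, 1.0], target)
```

Minimizing such a loss with respect to the encoder pulls together the latents of states that yield similar rewards and similar predicted futures, regardless of how different their raw observations look, which is the sense in which the representation can ignore task-irrelevant details.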
