Accelerating Representation Learning with View-Consistent Dynamics in Data-Efficient Reinforcement Learning

Abstract

Learning informative representations from image-based observations is of fundamental concern in deep Reinforcement Learning (RL). However, data inefficiency remains a significant barrier to this objective. To overcome this obstacle, we propose to accelerate state representation learning by enforcing view-consistency on the dynamics. First, we introduce the formalism of a Multi-view Markov Decision Process (MMDP), which incorporates multiple views of the state. Following the structure of the MMDP, our method, View-Consistent Dynamics (VCD), learns state representations by training a view-consistent dynamics model in the latent space, where views are generated by applying data augmentation to states. Empirical evaluation on the DeepMind Control Suite and Atari-100k shows that VCD achieves state-of-the-art data efficiency on visual control tasks.
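
To make the idea of a view-consistent latent dynamics model concrete, the following is a minimal sketch, not the authors' implementation. It assumes random-shift augmentation to generate views, a linear encoder, a linear latent dynamics model, and a cross-view mean-squared-error objective; the abstract specifies none of these, and all function and parameter names (`random_shift`, `encode`, `predict_next`, `view_consistency_loss`, `W_enc`, `W_dyn`) are hypothetical.

```python
import numpy as np

def random_shift(img, pad, rng):
    """Assumed augmentation: edge-pad by `pad` pixels, then crop back at a random offset."""
    h, w = img.shape[:2]
    padded = np.pad(img, ((pad, pad), (pad, pad), (0, 0)), mode="edge")
    top = rng.integers(0, 2 * pad + 1)
    left = rng.integers(0, 2 * pad + 1)
    return padded[top:top + h, left:left + w]

def encode(img, W_enc):
    """Hypothetical linear encoder: flatten the image and project it to the latent space."""
    return W_enc @ img.reshape(-1)

def predict_next(z, action, W_dyn):
    """Hypothetical linear latent dynamics: next latent from (latent, action)."""
    return W_dyn @ np.concatenate([z, action])

def view_consistency_loss(s, a, s_next, W_enc, W_dyn, rng, pad=2):
    """Cross-view prediction error (an assumed objective, not taken from the paper)."""
    # Two independently augmented views of the current state.
    z1 = encode(random_shift(s, pad, rng), W_enc)
    z2 = encode(random_shift(s, pad, rng), W_enc)
    # Two independently augmented views of the next state serve as targets.
    t1 = encode(random_shift(s_next, pad, rng), W_enc)
    t2 = encode(random_shift(s_next, pad, rng), W_enc)
    # Predict the next latent from each view of the current state.
    p1 = predict_next(z1, a, W_dyn)
    p2 = predict_next(z2, a, W_dyn)
    # The prediction from one view is matched against the target from the other view.
    return 0.5 * (np.mean((p1 - t2) ** 2) + np.mean((p2 - t1) ** 2))

# Usage with random data: 8x8 RGB observations, 2-D actions, 4-D latents.
rng = np.random.default_rng(0)
s, s_next = rng.random((8, 8, 3)), rng.random((8, 8, 3))
a = rng.random(2)
W_enc = rng.normal(size=(4, 8 * 8 * 3))
W_dyn = rng.normal(size=(4, 4 + 2))
print(view_consistency_loss(s, a, s_next, W_enc, W_dyn, rng))
```

In a real agent the encoder and dynamics model would be neural networks trained by gradient descent on such a loss alongside the RL objective; the linear maps here only illustrate the data flow between views.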
