Contrastive Variational Reinforcement Learning for Complex Observations

Abstract

Deep reinforcement learning (DRL) has achieved significant success in various robot tasks, such as manipulation and navigation. However, complex visual observations in natural environments remain a major challenge. This paper presents Contrastive Variational Reinforcement Learning (CVRL), a model-based method that tackles complex visual observations in DRL. CVRL learns a contrastive variational model by maximizing the mutual information between latent states and observations discriminatively, through contrastive learning. It avoids modeling the complex observation space unnecessarily, as the commonly used generative observation model often does, and is significantly more robust. CVRL achieves comparable performance with state-of-the-art model-based DRL methods on standard Mujoco tasks. It significantly outperforms them on Natural Mujoco tasks and a robot box-pushing task with complex observations, e.g., dynamic shadows. The CVRL code is publicly available at https://github.com/Yusufma03/CVRL.
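
To make the idea of maximizing mutual information "discriminatively, through contrastive learning" concrete, the following is a minimal sketch of an InfoNCE-style objective, a common contrastive bound on mutual information. It is an illustration only and is not taken from the CVRL code: the function names, the dot-product scoring function, and the `temperature` parameter are all assumptions made for this example. Each latent state is scored against every observation embedding in the batch, and the loss rewards ranking its own paired observation above the others, so no pixel-level reconstruction is needed.

```python
import numpy as np


def info_nce_loss(latents, obs_embeddings, temperature=1.0):
    """InfoNCE loss for a batch of paired rows (hypothetical helper).

    latents:        array of shape (N, D), one latent state per row.
    obs_embeddings: array of shape (N, D), row i is the observation
                    embedding paired with latents[i].
    """
    # Dot-product score between every latent and every observation embedding.
    # (Assumed scoring function; other critics are possible.)
    logits = latents @ obs_embeddings.T / temperature  # shape (N, N)

    # Numerically stable log-softmax along each row.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # Matching pairs sit on the diagonal; the other entries in a row
    # serve as negative samples.
    return -np.mean(np.diag(log_probs))


def mutual_information_lower_bound(latents, obs_embeddings, temperature=1.0):
    """InfoNCE lower bound on mutual information: log(N) minus the loss."""
    n = latents.shape[0]
    return np.log(n) - info_nce_loss(latents, obs_embeddings, temperature)
```

Minimizing `info_nce_loss` raises the lower bound, which is capped at log(N) for a batch of N pairs. In a full model the two inputs would come from learned encoders; here they are plain arrays so the sketch stays self-contained.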
