Challenges and Opportunities in Offline Reinforcement Learning from Visual Observations

Abstract

Offline reinforcement learning has shown great promise in leveraging large pre-collected datasets for policy learning, allowing agents to forgo often-expensive online data collection. However, to date, offline reinforcement learning from visual observations has been relatively under-explored, and there is a lack of understanding of where the remaining challenges lie. In this paper, we seek to establish simple baselines for continuous control in the visual domain. We show that simple modifications to two state-of-the-art vision-based online reinforcement learning algorithms, DreamerV2 and DrQ-v2, suffice to outperform prior work and establish a competitive baseline. We rigorously evaluate these algorithms on both existing offline datasets and a new testbed for offline reinforcement learning from visual observations that better represents the data distributions present in real-world offline reinforcement learning problems, and open-source our code and data to facilitate progress in this important domain. Finally, we present and analyze several key desiderata unique to offline RL from visual observations, including visual distractions and visually identifiable changes in dynamics.
