Abstract
Offline methods for reinforcement learning have the potential to help bridge the gap between reinforcement learning research and real-world applications. They make it possible to learn policies from offline datasets, thus overcoming concerns associated with online data collection in the real world, such as cost, safety, and ethics. In this paper, we propose a benchmark called RL Unplugged to evaluate and compare offline RL methods. RL Unplugged includes data from a diverse range of domains, including games (e.g., the Atari benchmark) and simulated motor control problems (e.g., the DM Control Suite). The datasets include domains that are partially or fully observable, use continuous or discrete actions, and have stochastic or deterministic dynamics. We propose detailed evaluation protocols for each domain in RL Unplugged and provide an extensive analysis of supervised learning and offline RL methods using these protocols. We will release data for all our tasks and open-source all algorithms presented in this paper. We hope that our suite of benchmarks will increase the reproducibility of experiments and make it possible to study challenging tasks with a limited computational budget, thus making RL research both more systematic and more accessible across the community. Moving forward, we view RL Unplugged as a living benchmark suite that will evolve and grow with datasets contributed by the research community and ourselves. Our project page is available on \href{https://git.io/JJUhd}{GitHub}.