Abstract
Visual Reinforcement Learning (RL) methods often require extensive amounts of data. As opposed to model-free RL, model-based RL (MBRL) offers a potential solution with efficient data utilization through planning. Additionally, RL lacks generalization capabilities for real-world tasks. Prior work has shown that incorporating pre-trained visual representations (PVRs) enhances sample efficiency and generalization. While PVRs have been extensively studied in the context of model-free RL, their potential in MBRL remains largely unexplored. In this paper, we benchmark a set of PVRs on challenging control tasks in a model-based RL setting. We investigate the data efficiency, generalization capabilities, and the impact of different properties of PVRs on the performance of model-based agents. Our results, perhaps surprisingly, reveal that for MBRL current PVRs are not more sample efficient than learning representations from scratch, and that they do not generalize better to out-of-distribution (OOD) settings. To explain this, we analyze the quality of the trained dynamics model. Furthermore, we show that data diversity and network architecture are the most important contributors to OOD generalization performance.