The Surprising Ineffectiveness of Pre-Trained Visual Representations for Model-Based Reinforcement Learning

Abstract

Visual Reinforcement Learning (RL) methods often require extensive amounts of data. As opposed to model-free RL, model-based RL (MBRL) offers a potential solution with efficient data utilization through planning. Additionally, RL lacks generalization capabilities for real-world tasks. Prior work has shown that incorporating pre-trained visual representations (PVRs) enhances sample efficiency and generalization. While PVRs have been extensively studied in the context of model-free RL, their potential in MBRL remains largely unexplored. In this paper, we benchmark a set of PVRs on challenging control tasks in a model-based RL setting. We investigate the data efficiency, generalization capabilities, and the impact of different properties of PVRs on the performance of model-based agents. Our results, perhaps surprisingly, reveal that for MBRL current PVRs are not more sample efficient than learning representations from scratch, and that they do not generalize better to out-of-distribution (OOD) settings. To explain this, we analyze the quality of the trained dynamics model. Furthermore, we show that data diversity and network architecture are the most important contributors to OOD generalization performance.
