Abstract
Offline reinforcement learning (RL) enables policy optimization in static datasets, avoiding the risks and costs of real-world exploration. However, it struggles with suboptimal behavior learning and inaccurate value estimation due to the lack of environmental interaction. In this paper, we present Video-Enhanced Offline RL (VeoRL), a model-based approach that constructs an interactive world model from diverse, unlabeled video data readily available online. Leveraging model-based behavior guidance, VeoRL transfers commonsense knowledge of control policy and physical dynamics from natural videos to the RL agent within the target domain. Our method achieves substantial performance gains (exceeding 100% in some cases) across visuomotor control tasks in robotic manipulation, autonomous driving, and open-world video games.