Reinforcement Learning with Videos: Combining Offline Observations with Interaction

Abstract

Reinforcement learning is a powerful framework for robots to acquire skills from experience, but often requires a substantial amount of online data collection. As a result, it is difficult to collect sufficiently diverse experiences that are needed for robots to generalize broadly. Videos of humans, on the other hand, are a readily available source of broad and interesting experiences. In this paper, we consider the question: can we perform reinforcement learning directly on experience collected by humans? This problem is particularly difficult, as such videos are not annotated with actions and exhibit substantial visual domain shift relative to the robot's embodiment. To address these challenges, we propose a framework for reinforcement learning with videos (RLV). RLV learns a policy and value function using experience collected by humans in combination with data collected by robots. In our experiments, we find that RLV is able to leverage such videos to learn challenging vision-based skills with less than half as many samples as RL methods that learn from scratch.
