Visual Foresight: Model-Based Deep Reinforcement Learning for Vision-Based Robotic Control

Abstract

Deep reinforcement learning (RL) algorithms can learn complex robotic skills from raw sensory inputs, but have yet to achieve the kind of broad generalization and applicability demonstrated by deep learning methods in supervised domains. We present a deep RL method that is practical for real-world robotics tasks, such as robotic manipulation, and generalizes effectively to never-before-seen tasks and objects. In these settings, ground truth reward signals are typically unavailable, and we therefore propose a self-supervised model-based approach, where a predictive model learns to directly predict the future from raw sensory readings, such as camera images. At test time, we explore three distinct goal specification methods: designated pixels, where a user specifies desired object manipulation tasks by selecting particular pixels in an image and corresponding goal positions; goal images, where the desired goal state is specified with an image; and image classifiers, which define spaces of goal states. Our deep predictive models are trained using data collected autonomously and continuously by a robot interacting with hundreds of objects, without human supervision. We demonstrate that visual MPC can generalize to never-before-seen objects (both rigid and deformable) and solve a range of user-defined object manipulation tasks using the same model.
