QT-Opt: Scalable Deep Reinforcement Learning for Vision-Based Robotic Manipulation

  • 2018-11-28 02:40:54
  • Dmitry Kalashnikov, Alex Irpan, Peter Pastor, Julian Ibarz, Alexander Herzog, Eric Jang, Deirdre Quillen, Ethan Holly, Mrinal Kalakrishnan, Vincent Vanhoucke, Sergey Levine


In this paper, we study the problem of learning vision-based dynamic manipulation skills using a scalable reinforcement learning approach. We study this problem in the context of grasping, a longstanding challenge in robotic manipulation. In contrast to static learning behaviors that choose a grasp point and then execute the desired grasp, our method enables closed-loop vision-based control, whereby the robot continuously updates its grasp strategy based on the most recent observations to optimize long-horizon grasp success. To that end, we introduce QT-Opt, a scalable self-supervised vision-based reinforcement learning framework that can leverage over 580k real-world grasp attempts to train a deep neural network Q-function with over 1.2M parameters to perform closed-loop, real-world grasping that generalizes to 96% grasp success on unseen objects. Aside from attaining a very high success rate, our method exhibits behaviors that are quite distinct from more standard grasping systems: using only RGB vision-based perception from an over-the-shoulder camera, our method automatically learns regrasping strategies, probes objects to find the most effective grasps, learns to reposition objects and perform other non-prehensile pre-grasp manipulations, and responds dynamically to disturbances and perturbations.
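The contrast between choosing a grasp once and closed-loop control can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the world is one-dimensional, `toy_q` is a hand-written stand-in for a learned Q-function, and actions are chosen by scoring randomly sampled candidates. It shows only the general idea of re-selecting the highest-valued action from the most recent observation, so that a mid-episode disturbance is corrected.

```python
import random


def toy_q(observation, action):
    """Hypothetical stand-in for a learned Q-function.

    The score is higher when the action moves the gripper closer to the
    object position contained in the observation.
    """
    gripper, obj = observation
    return -abs((gripper + action) - obj)


def select_action(q_fn, observation, rng, num_candidates=64):
    """Return the highest-scoring action among randomly sampled candidates."""
    candidates = [rng.uniform(-1.0, 1.0) for _ in range(num_candidates)]
    return max(candidates, key=lambda a: q_fn(observation, a))


def run_episode(q_fn, closed_loop, gripper=0.0, obj=3.0,
                steps=10, disturb_at=3, seed=0):
    """Run one toy episode and return the final gripper-to-object distance.

    With closed_loop=True the object position is re-observed at every step.
    With closed_loop=False the first observation of the object is reused,
    which stands in for a plan that is fixed before execution.
    """
    rng = random.Random(seed)
    seen_obj = obj  # object position as last observed
    for t in range(steps):
        if t == disturb_at:
            obj += 2.0  # the object is pushed away mid-episode
        if closed_loop:
            seen_obj = obj  # refresh the observation
        gripper += select_action(q_fn, (gripper, seen_obj), rng)
    return abs(gripper - obj)


closed_error = run_episode(toy_q, closed_loop=True)
open_error = run_episode(toy_q, closed_loop=False)
print(f"closed-loop final distance: {closed_error:.3f}")
print(f"fixed-plan final distance:  {open_error:.3f}")
```

In this toy setting the closed-loop run ends close to the displaced object, while the fixed-plan run ends near the object's original position. A real system would replace `toy_q` with a network evaluated on camera images and would need a more careful action-selection procedure than uniform sampling.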


