Offline Learning of Counterfactual Perception as Prediction for Real-World Robotic Reinforcement Learning

  • 2020-11-11 15:45:17
  • Jun Jin, Daniel Graves, Cameron Haigh, Jun Luo, Martin Jagersand

Abstract

We propose a method for offline learning of counterfactual predictions to address real world robotic reinforcement learning challenges. The proposed method encodes action-oriented visual observations as several "what if" questions learned offline from prior experience using reinforcement learning methods. These "what if" questions counterfactually predict how action-conditioned observation would evolve on multiple temporal scales if the agent were to stick to its current action. We show that combining these offline counterfactual predictions along with online in-situ observations (e.g. force feedback) allows efficient policy learning with only a sparse terminal (success/failure) reward. We argue that the learned predictions form an effective representation of the visual task, and guide the online exploration towards high-potential success interactions (e.g. contact-rich regions). Experiments were conducted in both simulation and real-world scenarios for evaluation. Our results demonstrate that it is practical to train a reinforcement learning agent to perform real-world fine manipulation in about half a day, without hand engineered perception systems or calibrated instrumentation. Recordings of the real robot training can be found via https://sites.google.com/view/realrl.
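
To make the idea of predictions "on multiple temporal scales" concrete, the following is a minimal sketch, not the paper's formulation. It assumes that each "what if" prediction can be modeled as a discounted sum of some scalar observation signal recorded while one action is held fixed, with one discount factor per temporal scale. The abstract does not specify this form; the function name, the signal, and the discount values are all hypothetical.

```python
def discounted_predictions(signal, gammas):
    """Discounted sum of `signal` for each discount factor in `gammas`.

    Assumption (not from the abstract): `signal` is a scalar quantity
    observed at successive steps while the agent keeps the same action,
    and each gamma stands for one temporal scale.
    """
    return {
        g: sum((g ** t) * x for t, x in enumerate(signal))
        for g in gammas
    }


# Hypothetical signal observed over four steps under one fixed action.
signal = [1.0, 1.0, 1.0, 1.0]

# gamma = 0.0 only looks at the immediate step; gamma = 0.5 looks further ahead.
targets = discounted_predictions(signal, gammas=(0.0, 0.5))
```

Under this assumption, small discount factors summarize the near-term effect of continuing the current action and larger ones summarize its longer-term effect; such values could serve as regression targets when learning the predictions from logged experience.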

 
