We propose a method for offline learning of counterfactual predictions to address real-world robotic reinforcement learning challenges. The proposed method encodes action-oriented visual observations as several "what if" questions learned offline from prior experience using reinforcement learning methods. These "what if" questions counterfactually predict how action-conditioned observations would evolve on multiple temporal scales if the agent were to stick to its current action. We show that combining these offline counterfactual predictions with online in-situ observations (e.g. force feedback) allows efficient policy learning with only a sparse terminal (success/failure) reward. We argue that the learned predictions form an effective representation of the visual task and guide online exploration towards interactions with high potential for success (e.g. contact-rich regions). Experiments were conducted in both simulation and real-world scenarios for evaluation. Our results demonstrate that it is practical to train a reinforcement learning agent to perform real-world fine manipulation in about half a day, without hand-engineered perception systems or calibrated instrumentation. Recordings of the real robot training can be found at https://sites.google.com/view/realrl.
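To illustrate what a prediction "on multiple temporal scales" could look like, the following is a minimal sketch, not the method's actual implementation. It assumes that each "what if" question is trained towards a discounted sum of some scalar observation signal recorded while one action is repeated, with one discount factor per temporal scale. The function name `multi_scale_targets`, the `signal` values, and the choice of discount factors are hypothetical and do not come from the text above.

```python
from typing import Dict, Sequence


def multi_scale_targets(
    cumulants: Sequence[float], gammas: Sequence[float]
) -> Dict[float, float]:
    """Return the discounted sum of `cumulants` for each discount factor.

    Assumption: a larger gamma corresponds to a longer temporal scale.
    """
    targets = {}
    for gamma in gammas:
        total = 0.0
        # Accumulate backwards using G_t = c_t + gamma * G_{t+1}.
        for c in reversed(cumulants):
            total = c + gamma * total
        targets[gamma] = total
    return targets


# Hypothetical scalar signal observed while the agent repeats one action
# for four steps.
signal = [1.0, 1.0, 1.0, 1.0]
targets = multi_scale_targets(signal, gammas=[0.0, 0.5])
```

Under these assumptions, a discount factor of 0.0 keeps only the immediate signal, while 0.5 also weights later steps, so the two targets summarize the same action at a short and a longer horizon.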