Abstract
Vision and learning have made significant progress that could improve robotics policies for complex tasks and environments. Learning deep neural networks for image understanding, however, requires large amounts of domain-specific visual data. While collecting such data from real robots is possible, this approach limits scalability, as learning policies typically requires thousands of trials. In this work we attempt to learn manipulation policies in simulated environments. Simulators enable scalability and provide access to the underlying world state during training. Policies learned in simulators, however, do not transfer well to real scenes given the domain gap between real and synthetic data. We follow recent work on domain randomization and augment synthetic images with sequences of random transformations. Our main contribution is to optimize the augmentation strategy for sim2real transfer and to enable domain-independent policy learning. We design an efficient search for depth image augmentations using object localization as a proxy task. We then use the resulting sequence of random transformations to augment synthetic depth images during policy learning. Our augmentation strategy is policy-independent and enables policy learning with no real images. We demonstrate that our approach significantly improves accuracy on three manipulation tasks evaluated on a real robot.