Abstract
Safety is a crucial property of every robotic platform: any control policy should always comply with actuator limits and avoid collisions with the environment and humans. In reinforcement learning, safety is even more fundamental for exploring an environment without causing any damage. While there are many proposed solutions to the safe exploration problem, only a few of them can deal with the complexity of the real world. This paper introduces a new formulation of safe exploration for reinforcement learning of various robotic tasks. Our approach applies to a wide class of robotic platforms and enforces safety even under complex collision constraints learned from data by exploring the tangent space of the constraint manifold. Our proposed approach achieves state-of-the-art performance in simulated high-dimensional and dynamic tasks while avoiding collisions with the environment. We show safe real-world deployment of our learned controller on a TIAGo++ robot, achieving remarkable performance in manipulation and human-robot interaction tasks.