Out-of-Distribution Recovery with Object-Centric Keypoint Inverse Policy For Visuomotor Imitation Learning

Abstract

We propose an object-centric recovery policy framework to address the challenges of out-of-distribution (OOD) scenarios in visuomotor policy learning. Previous behavior cloning (BC) methods rely heavily on a large amount of labeled data coverage, failing in unfamiliar spatial states. Without relying on extra data collection, our approach learns a recovery policy constructed by an inverse policy inferred from object keypoint manifold gradient in the original training data. The recovery policy serves as a simple add-on to any base visuomotor BC policy, agnostic to a specific method, guiding the system back towards the training distribution to ensure task success even in OOD situations. We demonstrate the effectiveness of our object-centric framework in both simulation and real robot experiments, achieving an improvement of $\textbf{77.7\%}$ over the base policy in OOD. Project Website: https://sites.google.com/view/ocr-penn
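To make the idea of following a gradient back toward the training distribution concrete, here is a minimal sketch. It is not the paper's implementation: it assumes the training keypoints can be summarized by a Gaussian kernel density estimate, and the names `log_density_gradient`, `recovery_step`, `bandwidth`, and `step_size` are hypothetical. The sketch only produces a direction in keypoint space; mapping that direction to a robot action (the inverse policy) is outside its scope.

```python
import numpy as np


def log_density_gradient(query, train_keypoints, bandwidth=0.2):
    """Gradient of the log Gaussian-KDE density at `query`.

    Assumption: a KDE over training keypoints stands in for the
    keypoint manifold; the abstract does not specify this choice.
    query: (D,) array, train_keypoints: (N, D) array.
    """
    diffs = train_keypoints - query                  # (N, D), points toward data
    logits = -np.sum(diffs ** 2, axis=1) / (2.0 * bandwidth ** 2)
    weights = np.exp(logits - logits.max())          # subtract max for stability
    weights /= weights.sum()                         # softmax kernel weights
    return (weights[:, None] * diffs).sum(axis=0) / bandwidth ** 2


def recovery_step(query, train_keypoints, bandwidth=0.2, step_size=0.5):
    """Move `query` a fraction of a mean-shift step toward the training data."""
    grad = log_density_gradient(query, train_keypoints, bandwidth)
    # bandwidth**2 * grad equals (weighted mean of training points - query).
    return query + step_size * bandwidth ** 2 * grad


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-in for keypoints observed in demonstrations.
    train = rng.normal(0.0, 0.05, size=(200, 3))
    q = np.array([1.0, 1.0, 1.0])                    # an OOD keypoint position
    for _ in range(20):
        q = recovery_step(q, train)
    print("recovered keypoint:", q)
```

With `step_size=1.0` each update is a full mean-shift step; smaller values trade convergence speed for smoother motion.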
