SPIRE: Synergistic Planning, Imitation, and Reinforcement Learning for Long-Horizon Manipulation

Abstract

Robot learning has proven to be a general and effective technique for programming manipulators. Imitation learning is able to teach robots solely from human demonstrations but is bottlenecked by the capabilities of the demonstrations. Reinforcement learning uses exploration to discover better behaviors; however, the space of possible improvements can be too large to start from scratch. And for both techniques, the learning difficulty increases in proportion to the length of the manipulation task. Accounting for this, we propose SPIRE, a system that first uses Task and Motion Planning (TAMP) to decompose tasks into smaller learning subproblems and second combines imitation and reinforcement learning to maximize their strengths. We develop novel strategies to train learning agents when deployed in the context of a planning system. We evaluate SPIRE on a suite of long-horizon and contact-rich robot manipulation problems. We find that SPIRE outperforms prior approaches that integrate imitation learning, reinforcement learning, and planning by 35% to 50% in average task performance, is 6 times more data efficient in the number of human demonstrations needed to train proficient agents, and learns to complete tasks nearly twice as efficiently. View https://sites.google.com/view/spire-corl-2024 for more details.
