Latent Diffusion Planning for Imitation Learning

Abstract

Recent progress in imitation learning has been enabled by policy architectures that scale to complex visuomotor tasks, multimodal distributions, and large datasets. However, these methods often rely on learning from large amounts of expert demonstrations. To address this shortcoming, we propose Latent Diffusion Planning (LDP), a modular approach consisting of a planner, which can leverage action-free demonstrations, and an inverse dynamics model, which can leverage suboptimal data, both operating over a learned latent space. First, we learn a compact latent space through a variational autoencoder, enabling effective forecasting of future states in image-based domains. Then, we train a planner and an inverse dynamics model with diffusion objectives. By separating planning from action prediction, LDP can benefit from the denser supervision signals of suboptimal and action-free data. On simulated visual robotic manipulation tasks, LDP outperforms state-of-the-art imitation learning approaches, as they cannot leverage such additional data.
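
The separation of planning from action prediction can be illustrated by how each kind of data might be routed to each module. The sketch below is only an illustration under stated assumptions, not the authors' implementation: the `Trajectory` type, the `route_training_data` function, and the choice to train the autoencoder on all available images are hypothetical and are not specified in the abstract.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Trajectory:
    """Hypothetical container for one recorded episode."""
    observations: List[str]        # placeholders standing in for image frames
    actions: Optional[List[str]]   # None for action-free demonstrations
    is_expert: bool                # False for suboptimal data


def route_training_data(trajectories: List[Trajectory]) -> Dict[str, List[Trajectory]]:
    """Assign each trajectory to the modules it could supervise."""
    splits: Dict[str, List[Trajectory]] = {
        "vae": [],
        "planner": [],
        "inverse_dynamics": [],
    }
    for traj in trajectories:
        # Assumption: every image can help learn the latent space.
        splits["vae"].append(traj)
        # The planner forecasts future latent states, so it needs
        # expert-quality state sequences but no action labels.
        if traj.is_expert:
            splits["planner"].append(traj)
        # The inverse dynamics model needs action labels, but the
        # behavior that produced them need not be optimal.
        if traj.actions is not None:
            splits["inverse_dynamics"].append(traj)
    return splits


demo = [
    Trajectory(["o0", "o1"], ["a0"], is_expert=True),    # full expert demonstration
    Trajectory(["o0", "o1"], None, is_expert=True),      # action-free demonstration
    Trajectory(["o0", "o1"], ["a0"], is_expert=False),   # suboptimal data with actions
]
splits = route_training_data(demo)
```

Under these assumptions, a monolithic policy that maps observations directly to actions could use only the first trajectory in `demo`, whereas the modular split lets the other two each supervise one component.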
