Abstract
Diffusion-based visuomotor policies excel at learning complex robotic tasks by effectively combining visual data with high-dimensional, multi-modal action distributions. However, diffusion models often suffer from slow inference due to costly denoising processes or require complex sequential training arising from recent distillation approaches. This paper introduces Riemannian Flow Matching Policy (RFMP), a model that inherits the easy training and fast inference capabilities of flow matching (FM). Moreover, RFMP inherently incorporates geometric constraints commonly found in realistic robotic applications, as the robot state resides on a Riemannian manifold. To enhance the robustness of RFMP, we propose Stable RFMP (SRFMP), which leverages LaSalle's invariance principle to equip the dynamics of FM with stability to the support of a target Riemannian distribution. Rigorous evaluation on eight simulated and real-world tasks shows that RFMP successfully learns and synthesizes complex sensorimotor policies on Euclidean and Riemannian spaces with efficient training and inference phases, outperforming Diffusion Policies and Consistency Policies.