PLANRL: A Motion Planning and Imitation Learning Framework to Bootstrap Reinforcement Learning

Abstract

Reinforcement Learning (RL) has shown remarkable progress in simulation environments, yet its application to real-world robotic tasks remains limited due to challenges in exploration and generalization. To address these issues, we introduce PLANRL, a framework that chooses when the robot should use classical motion planning and when it should learn a policy. To further improve exploration efficiency, we use imitation data to bootstrap exploration. PLANRL dynamically switches between two modes of operation: reaching a waypoint using classical techniques when away from objects, and reinforcement learning for fine-grained manipulation control when about to interact with objects. The PLANRL architecture is composed of ModeNet for mode classification, NavNet for waypoint prediction, and InteractNet for precise manipulation. By combining the strengths of RL and Imitation Learning (IL), PLANRL improves sample efficiency and mitigates distribution shift, ensuring robust task execution. We evaluate our approach across multiple challenging simulation environments and real-world tasks, demonstrating superior performance in terms of adaptability, efficiency, and generalization compared to existing methods. In simulations, PLANRL surpasses baseline methods by 10-15% in training success rates at 30k samples and by 30-40% during evaluation phases. In real-world scenarios, it demonstrates a 30-40% higher success rate on simpler tasks compared to baselines and uniquely succeeds in complex, two-stage manipulation tasks. Datasets and supplementary materials can be found at https://raaslab.org/projects/NAVINACT/.
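The mode-switching idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: `mode_net`, `nav_net`, and `interact_net` are hypothetical stand-in callables for ModeNet, NavNet, and InteractNet, the 0.2 distance threshold and 2D state are invented for illustration, and a straight-line step is assumed in place of a real classical motion planner.

```python
from typing import Callable, Tuple

Vec = Tuple[float, ...]
NAV, INTERACT = "nav", "interact"  # the two operating modes


def step_toward(pos: Vec, waypoint: Vec, max_step: float) -> Vec:
    """Move in a straight line toward the waypoint, at most max_step per call.

    Assumption: this stands in for a classical motion planner.
    """
    delta = [w - p for p, w in zip(pos, waypoint)]
    dist = sum(d * d for d in delta) ** 0.5
    if dist <= max_step:
        return tuple(waypoint)
    scale = max_step / dist
    return tuple(p + d * scale for p, d in zip(pos, delta))


def hybrid_step(
    pos: Vec,
    mode_net: Callable[[Vec], str],
    nav_net: Callable[[Vec], Vec],
    interact_net: Callable[[Vec], Vec],
    max_step: float = 0.05,
) -> Tuple[str, Vec]:
    """Run one control step: classify the mode, then act accordingly."""
    mode = mode_net(pos)
    if mode == NAV:
        # Far from objects: head to the predicted waypoint with the planner.
        return mode, step_toward(pos, nav_net(pos), max_step)
    # Near objects: apply the learned policy's fine-grained displacement.
    action = interact_net(pos)
    return mode, tuple(p + a for p, a in zip(pos, action))


# Hypothetical stand-ins for the three networks (illustration only).
OBJECT_POS: Vec = (1.0, 0.0)


def demo_mode_net(pos: Vec) -> str:
    dist = sum((o - p) ** 2 for p, o in zip(pos, OBJECT_POS)) ** 0.5
    return NAV if dist > 0.2 else INTERACT


def demo_nav_net(pos: Vec) -> Vec:
    return (0.9, 0.0)  # fixed waypoint just short of the object


def demo_interact_net(pos: Vec) -> Vec:
    return (0.01, 0.0)  # small displacement toward the object
```

Calling `hybrid_step` repeatedly from a start such as `(0.0, 0.0)` with the demo callables would first take planner steps toward the waypoint and then hand control to the policy once the state is within the threshold of the object. In a learned system, the hard-coded rules above would be replaced by trained models.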
