CarPlanner: Consistent Auto-regressive Trajectory Planning for Large-scale Reinforcement Learning in Autonomous Driving

Abstract

Trajectory planning is vital for autonomous driving, ensuring safe and efficient navigation in complex environments. While recent learning-based methods, particularly reinforcement learning (RL), have shown promise in specific scenarios, RL planners struggle with training inefficiencies and with managing large-scale, real-world driving scenarios. In this paper, we introduce CarPlanner, a Consistent auto-regressive Planner that uses RL to generate multi-modal trajectories. The auto-regressive structure enables efficient large-scale RL training, while the incorporation of consistency ensures stable policy learning by maintaining temporal consistency across time steps. Moreover, CarPlanner employs a generation-selection framework with an expert-guided reward function and an invariant-view module, simplifying RL training and enhancing policy performance. Extensive analysis demonstrates that our proposed RL framework effectively addresses the challenges of training efficiency and performance enhancement, positioning CarPlanner as a promising solution for trajectory planning in autonomous driving. To the best of our knowledge, we are the first to demonstrate that an RL-based planner can surpass both imitation learning (IL)-based and rule-based state-of-the-art (SOTA) methods on the challenging large-scale real-world dataset nuPlan. Our proposed CarPlanner surpasses RL-, IL-, and rule-based SOTA approaches on this demanding dataset.
