RAD: Training an End-to-End Driving Policy via Large-Scale 3DGS-based Reinforcement Learning

  • 2025-02-18 18:59:21
  • Hao Gao, Shaoyu Chen, Bo Jiang, Bencheng Liao, Yiang Shi, Xiaoyang Guo, Yuechuan Pu, Haoran Yin, Xiangyu Li, Xinbang Zhang, Ying Zhang, Wenyu Liu, Qian Zhang, Xinggang Wang

Abstract

Existing end-to-end autonomous driving (AD) algorithms typically follow the Imitation Learning (IL) paradigm, which faces challenges such as causal confusion and the open-loop gap. In this work, we establish a 3DGS-based closed-loop Reinforcement Learning (RL) training paradigm. By leveraging 3DGS techniques, we construct a photorealistic digital replica of the real physical world, enabling the AD policy to extensively explore the state space and learn to handle out-of-distribution scenarios through large-scale trial and error. To enhance safety, we design specialized rewards that guide the policy to effectively respond to safety-critical events and understand real-world causal relationships. For better alignment with human driving behavior, IL is incorporated into RL training as a regularization term. We introduce a closed-loop evaluation benchmark consisting of diverse, previously unseen 3DGS environments. Compared to IL-based methods, RAD achieves stronger performance in most closed-loop metrics, especially 3x lower collision rate. Abundant closed-loop results are presented at https://hgao-cv.github.io/RAD.

 
