Abstract
This paper addresses the challenges of training end-to-end autonomous driving agents using Reinforcement Learning (RL). RL agents are typically trained in simulation on a fixed set of scenarios in which surrounding road users exhibit nominal behavior, which limits their generalization and real-life deployment. While domain randomization offers a potential solution by randomly sampling driving scenarios, it frequently results in inefficient training and sub-optimal policies due to the high variance among training scenarios. To address these limitations, we propose an automatic curriculum learning (ACL) framework that dynamically generates driving scenarios with adaptive complexity based on the agent's evolving capabilities. Unlike manually designed curricula, which introduce expert bias and lack scalability, our framework incorporates a ``teacher'' that automatically generates and mutates driving scenarios based on their learning potential, an agent-centric metric derived from the agent's current policy, eliminating the need for expert design. The framework enhances training efficiency by excluding scenarios that the agent has already mastered or currently finds too challenging. We evaluate our framework in a reinforcement learning setting where the agent learns a driving policy from camera images. Comparative results against baseline methods, including fixed scenario training and domain randomization, demonstrate that our approach leads to enhanced generalization, achieving higher success rates (+9\% in low traffic density and +21\% in high traffic density) and faster convergence with fewer training steps. Our findings highlight the potential of ACL in improving the robustness and efficiency of RL-based autonomous driving agents.
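
The teacher loop described above can be illustrated with a minimal sketch. The abstract does not define the learning-potential metric, so the code assumes a simple proxy, p(1 - p) for a per-scenario success rate p, which is low both for mastered scenarios (p near 1) and for scenarios that are currently too hard (p near 0). The `Scenario` fields, the `mutate` rule, and the `threshold` value are likewise hypothetical and are not taken from the paper.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    """Hypothetical scenario parameters; not the paper's scenario encoding."""
    num_vehicles: int
    speed_limit_kmh: float


def learning_potential(success_rate: float) -> float:
    """Assumed proxy: p * (1 - p), maximal at p = 0.5 and zero at p = 0 or 1."""
    return success_rate * (1.0 - success_rate)


def mutate(scenario: Scenario, rng: random.Random) -> Scenario:
    """Perturb a scenario slightly to obtain a neighboring scenario."""
    return Scenario(
        num_vehicles=max(0, scenario.num_vehicles + rng.choice([-1, 1])),
        speed_limit_kmh=max(10.0, scenario.speed_limit_kmh + rng.uniform(-5.0, 5.0)),
    )


def teacher_step(buffer, success_rates, rng, threshold=0.05):
    """Drop scenarios with low potential, then add one mutation of each kept one."""
    kept = [s for s in buffer if learning_potential(success_rates[s]) > threshold]
    return kept + [mutate(s, rng) for s in kept]


rng = random.Random(0)
easy = Scenario(num_vehicles=2, speed_limit_kmh=30.0)
medium = Scenario(num_vehicles=10, speed_limit_kmh=50.0)
hard = Scenario(num_vehicles=40, speed_limit_kmh=90.0)

# Assumed success rates of the current policy on each scenario.
rates = {easy: 1.0, medium: 0.5, hard: 0.0}
next_buffer = teacher_step([easy, medium, hard], rates, rng)
```

Under these assumed success rates, only the medium scenario survives the filter, and the next buffer holds it together with one mutated variant. In a full training loop the success rates would be re-estimated from rollouts of the current policy before each teacher step.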