Abstract
Learning-based decision-making has the potential to enable generalizable Autonomous Driving (AD) policies, reducing the engineering overhead of rule-based approaches. Imitation Learning (IL) remains the dominant paradigm, benefiting from large-scale human demonstration datasets, but it suffers from inherent limitations such as distribution shift and imitation gaps. Reinforcement Learning (RL) presents a promising alternative, yet its adoption in AD remains limited due to the lack of standardized and efficient research frameworks. To this end, we introduce V-Max, an open research framework providing all the necessary tools to make RL practical for AD. V-Max is built on Waymax, a hardware-accelerated AD simulator designed for large-scale experimentation. We extend it using ScenarioNet's approach, enabling the fast simulation of diverse AD datasets.