Abstract
In autonomous driving, traditional Computer Vision (CV) agents often struggle in unfamiliar situations due to biases in the training data. Deep Reinforcement Learning (DRL) agents address this by learning from experience and maximizing rewards, which helps them adapt to dynamic environments. However, ensuring their generalization remains challenging, especially with static training environments. Additionally, DRL models lack transparency, making it difficult to guarantee safety in all scenarios, particularly those not seen during training. To tackle these issues, we propose a method that combines DRL with Curriculum Learning for autonomous driving. Our approach uses a Proximal Policy Optimization (PPO) agent and a Variational Autoencoder (VAE) to learn safe driving in the CARLA simulator. The agent is trained using two-fold curriculum learning: the environment difficulty is progressively increased, and a collision penalty is incorporated into the reward function to promote safety. This method improves the agent's adaptability and reliability in complex environments, and it sheds light on the nuances of balancing multiple reward components from different feedback signals in a single scalar reward function.

Keywords: Computer Vision, Deep Reinforcement Learning, Variational Autoencoder, Proximal Policy Optimization, Curriculum Learning, Autonomous Driving.
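
As a rough illustration of the two-fold curriculum and the single scalar reward described above, the following Python sketch shows one way the pieces could fit together. It is not the implementation used in this work. The stage names, the `traffic_density` difficulty knob, the penalty magnitudes, the reward terms, and the fixed episode schedule are all hypothetical choices made for the example, including the assumption that the collision penalty is introduced only after the easiest stage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CurriculumStage:
    """One stage of a hypothetical two-fold curriculum."""
    name: str
    traffic_density: float    # assumed difficulty knob (0.0 = empty road)
    collision_penalty: float  # assumed magnitude subtracted on collision


# Illustrative schedule: difficulty rises and the penalty is phased in.
STAGES = [
    CurriculumStage("easy", traffic_density=0.0, collision_penalty=0.0),
    CurriculumStage("medium", traffic_density=0.5, collision_penalty=5.0),
    CurriculumStage("hard", traffic_density=1.0, collision_penalty=10.0),
]


def stage_for_episode(episode: int, episodes_per_stage: int = 100) -> CurriculumStage:
    """Pick the stage for an episode, staying on the last stage once reached."""
    index = min(episode // episodes_per_stage, len(STAGES) - 1)
    return STAGES[index]


def scalar_reward(speed_term: float, centering_term: float,
                  collided: bool, stage: CurriculumStage) -> float:
    """Fold several feedback signals into one scalar reward.

    speed_term and centering_term stand in for whatever driving-quality
    signals the environment provides (hypothetical names).
    """
    reward = speed_term + centering_term
    if collided:
        reward -= stage.collision_penalty
    return reward
```

With this schedule, `stage_for_episode(150)` selects the "medium" stage, and a collision in the "hard" stage turns a step reward of 1.5 into -8.5. This makes the balancing problem concrete: the penalty magnitude has to be weighed against the scale of the other reward terms.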