Abstract
Deep reinforcement learning (DRL) allows a system to interact with its environment and take actions by training an efficient policy that maximizes self-defined rewards. In autonomous driving, it can be used as a strategy for high-level decision making, whereas low-level algorithms such as hybrid A* path planning have proven their ability to solve the local trajectory planning problem. In this work, we combine these two methods: the DRL agent makes high-level decisions such as lane change commands. Given a lane change command, the hybrid A* planner generates a collision-free trajectory, which is then executed by a model predictive controller (MPC). In addition, the DRL algorithm is able to keep the lane change command consistent within a chosen time period. Traffic rules are implemented using linear temporal logic (LTL), which is then utilized as a reward function in DRL. Furthermore, we validate the proposed method on a real system to demonstrate its feasibility from simulation to implementation on real hardware.
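
To make the idea of a lane change command that stays consistent over a chosen time period concrete, the following is a minimal sketch, not the mechanism used in this work. It assumes a simple wrapper that latches a non-trivial command for a fixed number of decision steps; the names `CommandHold`, `hold_steps`, and the integer command encoding are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass

# Hypothetical command encoding (an assumption, not taken from the paper).
KEEP, LEFT, RIGHT = 0, 1, 2


@dataclass
class CommandHold:
    """Latch a lane change command for a fixed number of decision steps."""

    hold_steps: int      # assumed length of the hold window, in decision steps
    command: int = KEEP  # command currently being executed
    remaining: int = 0   # steps left in the current hold window

    def step(self, proposed: int) -> int:
        """Return the command to execute, given the policy's new proposal."""
        if self.remaining > 0:
            # Still inside the hold window: ignore the new proposal.
            self.remaining -= 1
            return self.command
        # Outside the window: accept the proposal.
        self.command = proposed
        if proposed != KEEP:
            # A lane change starts a new window; this step counts as the first.
            self.remaining = self.hold_steps - 1
        return self.command


hold = CommandHold(hold_steps=3)
proposals = [KEEP, LEFT, RIGHT, KEEP, RIGHT, KEEP]
executed = [hold.step(p) for p in proposals]
```

In this sketch the `LEFT` command proposed at the second step is held for three steps, so the conflicting `RIGHT` proposal that follows immediately is ignored until the window has elapsed. A downstream planner would then always receive a stable target lane for the duration of the window.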