Back to Basics: Deep Reinforcement Learning in Traffic Signal Control

Abstract

In this paper we revisit some of the fundamental premises for a reinforcement learning (RL) approach to self-learning traffic lights. We propose RLight, a combination of choices that offers robust performance and good generalization to unseen traffic flows. In particular, our main contributions are threefold: our lightweight and cluster-aware state representation leads to improved performance; we reformulate the Markov Decision Process (MDP) such that it skips redundant timesteps of yellow light, speeding up learning by 30%; and we investigate the action space and provide insight into the difference in performance between acyclic and cyclic phase transitions. Additionally, we provide insights into the generalization of the methods to unseen traffic. Evaluations using the real-world Hangzhou traffic dataset show that RLight outperforms state-of-the-art rule-based and deep reinforcement learning algorithms, demonstrating the potential of RL-based methods to improve urban traffic flows.
