HiLight: A Hierarchical Reinforcement Learning Framework with Global Adversarial Guidance for Large-Scale Traffic Signal Control

Abstract

Efficient traffic signal control (TSC) is essential for mitigating urban congestion, yet existing reinforcement learning (RL) methods face challenges in scaling to large networks while maintaining global coordination. Centralized RL suffers from scalability issues, while decentralized approaches often lack unified objectives, resulting in limited network-level efficiency. In this paper, we propose HiLight, a hierarchical reinforcement learning framework with global adversarial guidance for large-scale TSC. HiLight consists of a high-level Meta-Policy, which partitions the traffic network into subregions and generates sub-goals using a Transformer-LSTM architecture, and a low-level Sub-Policy, which controls individual intersections with global awareness. To improve the alignment between global planning and local execution, we introduce an adversarial training mechanism, where the Meta-Policy generates challenging yet informative sub-goals, and the Sub-Policy learns to surpass these targets, leading to more effective coordination. We evaluate HiLight across both synthetic and real-world benchmarks, and additionally construct a large-scale Manhattan network with diverse traffic conditions, including peak transitions, adverse weather, and holiday surges. Experimental results show that HiLight exhibits significant advantages in large-scale scenarios and remains competitive across standard benchmarks of varying sizes.
