Multi-Agent Deep Reinforcement Learning for Cooperative Connected Vehicles

Abstract

A millimeter-wave (mmWave) base station can offer abundant high-capacity channel resources to connected vehicles, so that their quality of service (QoS), in terms of downlink throughput, can be greatly improved. The mmWave base station can operate alongside existing base stations (e.g., a macro-cell base station) on non-overlapping channels, and each vehicle can decide which base station to associate with and which channel to use in the heterogeneous network. Furthermore, because mmWave communication is not omnidirectional, a vehicle must also decide how to align its beam direction toward the mmWave base station in order to associate with it. However, this joint problem is NP-hard, has combinatorial features, and therefore incurs a high computational cost. In this paper, we solve the problem in a 3-tier heterogeneous vehicular network (HetVNet) with multi-agent deep reinforcement learning (DRL) in a way that maximizes the expected total reward (i.e., downlink throughput) of the vehicles. The multi-agent deep deterministic policy gradient (MADDPG) approach is introduced to achieve an optimal policy in a continuous action domain.
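
To give a rough sense of the combinatorial growth mentioned above, the following minimal sketch counts joint decisions under a simplifying assumption that is not taken from the paper: each vehicle independently picks one base station, one channel, and one of a finite set of discretized beam directions. The function name and the example sizes (3 base stations, 4 channels, 8 beam directions) are hypothetical and chosen only for illustration.

```python
def joint_action_space_size(num_vehicles, num_base_stations,
                            num_channels, num_beam_directions):
    """Count joint decisions when every vehicle independently chooses
    a (base station, channel, beam direction) triple.

    Assumption (illustrative only): beam directions are discretized
    into a finite set, and any combination is allowed.
    """
    per_vehicle = num_base_stations * num_channels * num_beam_directions
    # Vehicles choose independently, so the counts multiply across vehicles.
    return per_vehicle ** num_vehicles


if __name__ == "__main__":
    # Hypothetical sizes: 3 base stations, 4 channels, 8 beam directions.
    for n in (1, 5, 10):
        print(n, joint_action_space_size(n, 3, 4, 8))
```

Under these assumptions the count grows exponentially with the number of vehicles, which is one way to see why exhaustive search becomes impractical and why a learned policy over a continuous action domain is attractive.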
