Delay-Aware Multi-Agent Reinforcement Learning

Abstract

Action and observation delays are prevalent in real-world cyber-physical systems and may pose challenges for reinforcement learning design. Handling them is particularly arduous in multi-agent systems, where the delay of one agent can spread to other agents. To resolve this problem, this paper proposes a novel framework that deals with delays as well as the non-stationary training issue of multi-agent tasks using model-free deep reinforcement learning. We formally define the Delay-Aware Markov Game, which incorporates the delays of all agents in the environment. To solve Delay-Aware Markov Games, we apply centralized training and decentralized execution, which allows agents to use extra information to ease the non-stationarity of multi-agent systems during training, without the need for a centralized controller during execution. Experiments are conducted in multi-agent particle environments, including cooperative communication, cooperative navigation, and competitive experiments. We also test the proposed algorithm in traffic scenarios that require coordination of all autonomous vehicles to show the practical value of delay-awareness. Results show that the proposed delay-aware multi-agent reinforcement learning algorithm greatly alleviates the performance degradation introduced by delay. Code is available at: https://github.com/baimingc/damarl.
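The abstract does not spell out how per-agent delays enter the game, so the following is only an illustrative sketch under stated assumptions, not the authors' implementation. It assumes each agent has a fixed action delay measured in environment steps, and that a delay-aware agent observes its own pending actions alongside its usual observation. The class name `DelayedActionBuffer`, its methods, and the agent names are hypothetical.

```python
from collections import deque


class DelayedActionBuffer:
    """Per-agent FIFO queue: an action chosen at step t takes effect at step t + delay.

    Hypothetical helper for illustration; not taken from the paper's code.
    """

    def __init__(self, delays, default_action=0.0):
        # delays: dict mapping agent name -> action delay in steps (assumed fixed).
        # Each queue starts filled with `delay` default actions.
        self.queues = {
            agent: deque([default_action] * d) for agent, d in delays.items()
        }

    def step(self, actions):
        """Enqueue newly chosen actions and return the ones that take effect now."""
        applied = {}
        for agent, action in actions.items():
            queue = self.queues[agent]
            queue.append(action)
            applied[agent] = queue.popleft()
        return applied

    def augmented_observation(self, agent, observation):
        """Return the observation followed by the agent's pending actions."""
        return list(observation) + list(self.queues[agent])


# Agent "a" acts with a two-step delay; agent "b" acts immediately.
buffer = DelayedActionBuffer({"a": 2, "b": 0})
first = buffer.step({"a": 1.0, "b": 5.0})
second = buffer.step({"a": 2.0, "b": 6.0})
third = buffer.step({"a": 3.0, "b": 7.0})
pending_view = buffer.augmented_observation("a", [0.5])
```

In this sketch, the action agent "a" selects in the first step only reaches the environment in the third step, while agent "b" is unaffected. Appending the pending actions to the observation is one way to keep the decision problem Markov despite the delay.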
