Improving Coordination in Small-Scale Multi-Agent Deep Reinforcement Learning through Memory-driven Communication

Abstract

Deep reinforcement learning algorithms have recently been used to train multiple interacting agents in a centralised manner whilst keeping their execution decentralised. When the agents can only acquire partial observations and are faced with tasks requiring coordination and synchronisation skills, inter-agent communication plays an essential role. In this work, we propose a framework for multi-agent training using deep deterministic policy gradients that enables concurrent, end-to-end learning of an explicit communication protocol through a memory device. During training, the agents learn to perform read and write operations enabling them to infer a shared representation of the world. We empirically demonstrate that concurrent learning of the communication device and individual policies can improve inter-agent coordination and performance in small-scale systems. Our experimental results show that the proposed method achieves superior performance in scenarios with up to six agents. We illustrate how different communication patterns can emerge on six different tasks of increasing complexity. Furthermore, we study the effects of corrupting the communication channel, provide a visualisation of the time-varying memory content as the underlying task is being solved and validate the building blocks of the proposed memory device through ablation studies.
