Containerized Distributed Value-Based Multi-Agent Reinforcement Learning

Abstract

Multi-agent reinforcement learning tasks put a high demand on the volume of training samples. Unlike its single-agent counterpart, distributed value-based multi-agent reinforcement learning faces the unique challenges of demanding data transfer, inter-process communication management, and a high demand for exploration. We propose a containerized learning framework to solve these problems. Each container packs several environment instances, a local learner and buffer, and a carefully designed multi-queue manager that avoids blocking. The local policies of the containers are encouraged to be as diverse as possible, and only the trajectories with the highest priority are sent to a global learner. In this way, we achieve a scalable, time-efficient, and diverse distributed MARL learning framework with high system throughput. To our knowledge, our method is the first to solve the challenging Google Research Football full game $5\_v\_5$. On the StarCraft II micromanagement benchmark, our method achieves $4$-$18\times$ better results than state-of-the-art non-distributed MARL algorithms.
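
The data flow described above, in which each container keeps a local buffer and forwards only its highest-priority trajectories to a global learner, can be illustrated with a minimal sketch. This is not the authors' implementation: the names `Trajectory`, `LocalBuffer`, and `collect_for_global` are hypothetical, the per-container quota `k` is an assumed parameter, and the abstract does not say how priority is computed, so it is treated here as an opaque float.

```python
import heapq
from dataclasses import dataclass, field
from typing import List


@dataclass
class Trajectory:
    container_id: int
    # Assumption: priority is a single float (for example a TD-error-based
    # score); the abstract does not specify its definition.
    priority: float
    steps: list = field(default_factory=list)


class LocalBuffer:
    """Hypothetical per-container buffer holding locally collected trajectories."""

    def __init__(self, container_id: int):
        self.container_id = container_id
        self.items: List[Trajectory] = []

    def add(self, priority: float, steps: list) -> None:
        self.items.append(Trajectory(self.container_id, priority, steps))

    def pop_top(self, k: int) -> List[Trajectory]:
        """Remove and return the k highest-priority trajectories."""
        top = heapq.nlargest(k, self.items, key=lambda t: t.priority)
        top_ids = {id(t) for t in top}
        self.items = [t for t in self.items if id(t) not in top_ids]
        return top


def collect_for_global(buffers: List[LocalBuffer], k: int) -> List[Trajectory]:
    """Gather the top-k trajectories of every container for the global learner."""
    batch: List[Trajectory] = []
    for buf in buffers:
        batch.extend(buf.pop_top(k))
    return batch


# Two containers with a few trajectories each (empty step lists for brevity).
a, b = LocalBuffer(0), LocalBuffer(1)
for p in (0.1, 0.9, 0.5):
    a.add(p, [])
for p in (0.3, 0.7):
    b.add(p, [])

global_batch = collect_for_global([a, b], k=1)
```

Forwarding only a small, prioritized subset keeps the traffic between containers and the global learner low, while the remaining trajectories stay available to each container's local learner.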
