JaxMARL: Multi-Agent RL Environments in JAX

Abstract

Benchmarks play an important role in the development of machine learning algorithms. For example, research in reinforcement learning (RL) has been heavily influenced by available environments and benchmarks. However, RL environments are traditionally run on the CPU, limiting their scalability with typical academic compute. Recent advancements in JAX have enabled the wider use of hardware acceleration to overcome these computational hurdles, enabling massively parallel RL training pipelines and environments. This is particularly useful for multi-agent reinforcement learning (MARL) research. First of all, multiple agents must be considered at each environment step, adding computational burden, and secondly, the sample complexity is increased due to non-stationarity, decentralised partial observability, or other MARL challenges. In this paper, we present JaxMARL, the first open-source code base that combines ease-of-use with GPU-enabled efficiency, and supports a large number of commonly used MARL environments as well as popular baseline algorithms. When considering wall clock time, our experiments show that per-run our JAX-based training pipeline is up to 12500x faster than existing approaches. This enables efficient and thorough evaluations, with the potential to alleviate the evaluation crisis of the field. We also introduce and benchmark SMAX, a vectorised, simplified version of the popular StarCraft Multi-Agent Challenge, which removes the need to run the StarCraft II game engine. This not only enables GPU acceleration, but also provides a more flexible MARL environment, unlocking the potential for self-play, meta-learning, and other future applications in MARL. We provide code at https://github.com/flairox/jaxmarl.
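
To make the idea of massively parallel environments concrete, the following is a minimal sketch of how a pure-function environment can be batched with `jax.vmap` and compiled with `jax.jit`. It is not the JaxMARL API: the toy environment, the functions `reset` and `step`, and the constants `NUM_AGENTS` and `NUM_ENVS` are all hypothetical and chosen only for illustration.

```python
import jax
import jax.numpy as jnp

# Hypothetical sizes, chosen only for illustration.
NUM_AGENTS = 3
NUM_ENVS = 1024


def reset(key):
    """Toy environment state: one position per agent on the line [-1, 1]."""
    return jax.random.uniform(key, (NUM_AGENTS,), minval=-1.0, maxval=1.0)


def step(state, actions):
    """Each agent picks an action in {0, 1, 2}: move left, stay, or move right."""
    new_state = jnp.clip(state + 0.1 * (actions - 1), -1.0, 1.0)
    # Shared team reward: negative mean distance of the agents from the origin.
    reward = -jnp.mean(jnp.abs(new_state))
    return new_state, reward


# vmap turns the single-environment functions into batched ones;
# jit compiles them so the whole batch runs as one accelerated program.
batched_reset = jax.jit(jax.vmap(reset))
batched_step = jax.jit(jax.vmap(step))

key = jax.random.PRNGKey(0)
reset_key, action_key = jax.random.split(key)

states = batched_reset(jax.random.split(reset_key, NUM_ENVS))
actions = jax.random.randint(action_key, (NUM_ENVS, NUM_AGENTS), 0, 3)
next_states, rewards = batched_step(states, actions)

print(next_states.shape, rewards.shape)
```

Because `reset` and `step` are pure functions of their inputs, the same code runs unchanged on CPU, GPU, or TPU, and the number of parallel environments is just the leading array dimension.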
