Abstract
Benchmarks play a crucial role in the development and analysis of reinforcement learning (RL) algorithms, with environment availability strongly impacting research. One particularly underexplored intersection is continual learning (CL) in cooperative multi-agent settings. To remedy this, we introduce MEAL (Multi-agent Environments for Adaptive Learning), the first benchmark tailored for continual multi-agent reinforcement learning (CMARL). Existing CL benchmarks run environments on the CPU, leading to computational bottlenecks and limiting the length of task sequences. MEAL leverages JAX for GPU acceleration, enabling continual learning across sequences of 100 tasks on a standard desktop PC in a few hours. We show that naively combining popular CL and MARL methods yields strong performance on simple environments, but fails to scale to more complex settings requiring sustained coordination and adaptation. Our ablation study identifies architectural and algorithmic features critical for CMARL on MEAL.