Boosting Sample Efficiency and Generalization in Multi-agent Reinforcement Learning via Equivariance

Abstract

Multi-Agent Reinforcement Learning (MARL) struggles with sample inefficiency and poor generalization [1]. These challenges are partially due to a lack of structure or inductive bias in the neural networks typically used in learning the policy. One such form of structure that is commonly observed in multi-agent scenarios is symmetry. The field of Geometric Deep Learning has developed Equivariant Graph Neural Networks (EGNNs) that are equivariant (or symmetric) to rotations, translations, and reflections of nodes. Incorporating equivariance has been shown to improve learning efficiency and decrease error [2]. In this paper, we demonstrate that EGNNs improve sample efficiency and generalization in MARL. However, we also show that a naive application of EGNNs to MARL results in poor early exploration due to a bias in the EGNN structure. To mitigate this bias, we present Exploration-enhanced Equivariant Graph Neural Networks, or E2GN2. We compare E2GN2 to other common function approximators using the common MARL benchmarks MPE and SMACv2. E2GN2 demonstrates a significant improvement in sample efficiency, higher final reward at convergence, and a 2x-5x gain over standard GNNs in our generalization tests. These results pave the way for more reliable and effective solutions in complex multi-agent systems.
