Prioritized League Reinforcement Learning for Large-Scale Heterogeneous Multiagent Systems

Abstract

Large-scale heterogeneous multiagent systems reflect realistic factors such as agents with diverse abilities and overall system cost. Compared with homogeneous systems, heterogeneous systems offer significant practical advantages. Nonetheless, they also present challenges for multiagent reinforcement learning, including the non-stationarity problem and an imbalance in the number of agents of each type. We propose a Prioritized Heterogeneous League Reinforcement Learning (PHLRL) method to address large-scale heterogeneous cooperation problems. PHLRL maintains a record of the policies that agents have explored during training and establishes a heterogeneous league of diverse policies to aid future policy optimization. Furthermore, we design a prioritized policy gradient approach to compensate for the gap caused by differences in the number of agents of each type. We also use Unreal Engine to design a large-scale heterogeneous cooperation benchmark named Large-Scale Multiagent Operation (LSMO), a complex two-team competition scenario that requires collaboration between ground and airborne agents. Experiments show that PHLRL outperforms state-of-the-art methods, including QTRAN and QPLEX, in LSMO.
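The abstract does not specify how the prioritized policy gradient assigns priorities, so the following is only an illustrative sketch of one simple way to compensate for type imbalance: weighting each agent's policy-gradient term by the inverse frequency of its type. The function names `type_priority_weights` and `weighted_pg_loss`, and the inverse-frequency rule itself, are assumptions made for illustration and are not taken from PHLRL.

```python
from collections import Counter


def type_priority_weights(agent_types):
    """Hypothetical helper: inverse-frequency weight per agent, scaled to mean 1.

    Agents of a rare type get larger weights, so every type contributes
    the same total weight regardless of how many agents it has.
    """
    counts = Counter(agent_types)
    raw = [1.0 / counts[t] for t in agent_types]
    scale = len(raw) / sum(raw)  # normalize so the weights average to 1
    return [w * scale for w in raw]


def weighted_pg_loss(log_probs, advantages, weights):
    """Weighted policy-gradient surrogate loss: -mean(w * log_prob * advantage)."""
    terms = [w * lp * a for w, lp, a in zip(weights, log_probs, advantages)]
    return -sum(terms) / len(terms)


# Example: three ground agents and one airborne agent (assumed type labels).
agent_types = ["ground", "ground", "ground", "air"]
weights = type_priority_weights(agent_types)
loss = weighted_pg_loss([-1.0, -1.0, -1.0, -1.0], [1.0, 1.0, 1.0, 1.0], weights)
```

With these weights, the single airborne agent carries the same total weight as the three ground agents combined, so the minority type is not drowned out in the averaged gradient. The actual method may use a different, learned, or adaptive priority.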
