A Cooperation Graph Approach for Multiagent Sparse Reward Reinforcement Learning

Abstract

Multiagent reinforcement learning (MARL) can solve complex cooperative tasks. However, the efficiency of existing MARL methods relies heavily on well-defined reward functions. Multiagent tasks with sparse reward feedback are especially challenging, not only because of the credit distribution problem but also because of the low probability of obtaining positive reward feedback. In this paper, we design a graph network called the Cooperation Graph (CG). The Cooperation Graph is the combination of two simple bipartite graphs, namely the Agent Clustering subgraph (ACG) and the Cluster Designating subgraph (CDG). Based on this novel graph structure, we propose the Cooperation Graph Multiagent Reinforcement Learning (CG-MARL) algorithm, which deals efficiently with the sparse reward problem in multiagent tasks. In CG-MARL, agents are directly controlled by the Cooperation Graph, and a policy neural network is trained to manipulate this graph, guiding agents to achieve cooperation in an implicit way. This hierarchical feature of CG-MARL provides space for customized cluster-actions, an extensible interface for introducing fundamental cooperation knowledge. In experiments, CG-MARL shows state-of-the-art performance on sparse reward multiagent benchmarks, including the anti-invasion interception task and the multi-cargo delivery task.
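To make the two-subgraph structure concrete, the following Python sketch models a graph of this shape as two assignment tables: one linking agents to clusters (playing the role of the ACG) and one linking clusters to targets (playing the role of the CDG). It is only an illustration under stated assumptions. The class name, the method names, the single-edge-per-node restriction, and the notion of a "target" are hypothetical choices made here, not details given in the abstract, and the learned policy network is replaced by direct method calls.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CooperationGraph:
    """Toy two-layer bipartite structure (illustrative only, not the paper's code)."""

    n_agents: int
    n_clusters: int
    n_targets: int
    # Assumption: each agent belongs to exactly one cluster (ACG-like edges).
    agent_to_cluster: List[int] = field(default_factory=list)
    # Assumption: each cluster is designated exactly one target (CDG-like edges).
    cluster_to_target: List[int] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Fill in a round-robin default assignment when none is supplied.
        if not self.agent_to_cluster:
            self.agent_to_cluster = [a % self.n_clusters for a in range(self.n_agents)]
        if not self.cluster_to_target:
            self.cluster_to_target = [c % self.n_targets for c in range(self.n_clusters)]

    def move_agent(self, agent: int, cluster: int) -> None:
        """Rewire one agent-to-cluster edge (an edit a policy might choose)."""
        if not (0 <= agent < self.n_agents and 0 <= cluster < self.n_clusters):
            raise ValueError("agent or cluster index out of range")
        self.agent_to_cluster[agent] = cluster

    def redirect_cluster(self, cluster: int, target: int) -> None:
        """Rewire one cluster-to-target edge (an edit a policy might choose)."""
        if not (0 <= cluster < self.n_clusters and 0 <= target < self.n_targets):
            raise ValueError("cluster or target index out of range")
        self.cluster_to_target[cluster] = target

    def agent_targets(self) -> List[int]:
        """Compose both layers: the target each agent is currently steered toward."""
        return [self.cluster_to_target[c] for c in self.agent_to_cluster]


if __name__ == "__main__":
    cg = CooperationGraph(n_agents=4, n_clusters=2, n_targets=3)
    cg.redirect_cluster(0, 2)  # the whole of cluster 0 now heads for target 2
    cg.move_agent(1, 0)        # agent 1 joins cluster 0 and inherits its target
    print(cg.agent_targets())
```

In this sketch a single edit to a cluster-to-target edge changes the behavior of every agent in that cluster at once, which is one way to read the abstract's statement that agents are controlled by the graph while the policy only manipulates the graph.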
