Abstract
In this paper, we investigate the use of global information to speed up the learning process and increase the cumulative rewards of multi-agent reinforcement learning (MARL) tasks. Within the actor-critic MARL setting, we introduce multiple cooperative critics from two levels of a hierarchy and propose a hierarchical critic-based multi-agent reinforcement learning algorithm. In our approach, each agent receives information from both local and global critics in a competition task. The agent thus not only receives low-level details but also considers coordination from the higher level, which receives global information, to improve its operation skills. We define these multiple cooperative critics in a top-down hierarchy as the Hierarchical Critics Assignment (HCA) framework. Our experiment, a two-player tennis competition task in the Unity environment, tested the HCA multi-agent framework based on Asynchronous Advantage Actor-Critic (A3C) with the Proximal Policy Optimization (PPO) algorithm. The results showed that the HCA framework outperforms the non-hierarchical critic baseline method on MARL tasks.
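To make the idea of an agent drawing on two critic levels concrete, the following is a minimal sketch of how a local and a global value estimate might be merged into a single one-step advantage signal. The abstract does not state the combination rule, so taking the maximum of the two estimates is an assumption made purely for illustration. The function names `combined_value` and `hca_advantage`, and the discount factor `gamma`, are likewise hypothetical.

```python
def combined_value(local_value: float, global_value: float) -> float:
    """Merge the two critic estimates into one value.

    ASSUMPTION: the maximum is used here only for illustration;
    the abstract does not specify how the critics are combined.
    """
    return max(local_value, global_value)


def hca_advantage(
    reward: float,
    local_value: float,
    global_value: float,
    next_local_value: float,
    next_global_value: float,
    gamma: float = 0.99,
) -> float:
    """One-step advantage estimate using the merged critic value.

    A(s, a) = r + gamma * V(s') - V(s), where V is the merged
    estimate from the local and global critics.
    """
    value = combined_value(local_value, global_value)
    next_value = combined_value(next_local_value, next_global_value)
    return reward + gamma * next_value - value


if __name__ == "__main__":
    # Hypothetical numbers for a single transition of one agent.
    adv = hca_advantage(
        reward=1.0,
        local_value=0.5,
        global_value=0.8,
        next_local_value=0.6,
        next_global_value=0.4,
        gamma=0.9,
    )
    print(f"advantage with both critics: {adv:.2f}")
```

Under this sketch, a non-hierarchical baseline corresponds to passing the local estimate for both critic arguments, so that only the local critic shapes the advantage.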