Voting-Based Multi-Agent Reinforcement Learning

Abstract

The recent success of single-agent reinforcement learning (RL) encourages the exploration of multi-agent reinforcement learning (MARL), which is more challenging due to the interactions among different agents. In this paper, we consider a voting-based MARL problem, in which the agents vote to make group decisions and the goal is to maximize the globally averaged returns. To this end, we formulate the MARL problem based on the linear programming form of the policy optimization problem and propose a distributed primal-dual algorithm to obtain the optimal solution. We also propose a voting mechanism through which the distributed learning achieves the same sub-linear convergence rate as centralized learning. In other words, the distributed decision making does not slow down the global consensus to the optimum. We also verify the convergence of our proposed algorithm with numerical simulations and conduct case studies in practical multi-agent systems.
