Deep Multi-Agent Reinforcement Learning with Hybrid Action Spaces based on Maximum Entropy

Abstract

Multi-agent deep reinforcement learning has been applied successfully to a variety of complex problems with either discrete or continuous action spaces. However, most real-world environments cannot be described by discrete action spaces alone or by continuous action spaces alone, and few works have applied deep reinforcement learning (DRL) to multi-agent problems with hybrid action spaces. To fill this gap, we propose a novel algorithm: Deep Multi-Agent Hybrid Soft Actor-Critic (MAHSAC). The algorithm follows the centralized training with decentralized execution (CTDE) paradigm and extends the Soft Actor-Critic (SAC) algorithm to handle hybrid action space problems in multi-agent environments based on maximum entropy. Our experiments are run on a simple multi-agent particle world with a continuous observation space and a discrete action space, along with some basic simulated physics. The experimental results show that MAHSAC performs well in training speed, stability, and robustness to interference. It also outperforms an existing independent deep hybrid learning method in both cooperative and competitive scenarios.
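
To make the terms "hybrid action space" and "maximum entropy" concrete, the following is a minimal illustrative sketch and not the MAHSAC implementation. It assumes a factored policy in which a discrete action k is drawn from a categorical distribution and a continuous parameter x is drawn from a one-dimensional Gaussian attached to that choice. The function names (sample_hybrid_action, hybrid_entropy) and the factorization itself are hypothetical and chosen only for illustration. The entropy computed here is the kind of quantity a maximum-entropy method adds to the reward as a bonus.

```python
import math
import random


def sample_hybrid_action(discrete_probs, means, stds, rng):
    """Sample a hybrid action (k, x) from an assumed factored policy.

    k is a discrete choice drawn from `discrete_probs`; x is a continuous
    parameter drawn from the Gaussian N(means[k], stds[k]^2).
    """
    k = rng.choices(range(len(discrete_probs)), weights=discrete_probs, k=1)[0]
    x = rng.gauss(means[k], stds[k])
    return k, x


def hybrid_entropy(discrete_probs, stds):
    """Joint entropy H(k, x) = H(k) + E_k[H(x | k)] of the factored policy."""
    # Shannon entropy of the discrete part (in nats).
    h_discrete = -sum(p * math.log(p) for p in discrete_probs if p > 0)
    # Differential entropy of a 1-D Gaussian is 0.5 * log(2 * pi * e * sigma^2),
    # averaged over the discrete choice.
    h_continuous = sum(
        p * 0.5 * math.log(2 * math.pi * math.e * s * s)
        for p, s in zip(discrete_probs, stds)
    )
    return h_discrete + h_continuous


if __name__ == "__main__":
    rng = random.Random(0)
    probs, means, stds = [0.5, 0.5], [0.0, 1.0], [1.0, 1.0]
    print(sample_hybrid_action(probs, means, stds, rng))
    print(hybrid_entropy(probs, stds))
```

In this sketch a wider Gaussian or a more uniform categorical distribution yields higher entropy, which is the behavior an entropy bonus encourages during exploration.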
