Belief States for Cooperative Multi-Agent Reinforcement Learning under Partial Observability

Abstract

Reinforcement learning in partially observable environments is typically challenging, as it requires agents to learn an estimate of the underlying system state. These challenges are exacerbated in multi-agent settings, where agents learn simultaneously and influence the underlying state as well as each other's observations. We propose the use of learned beliefs on the underlying state of the system to overcome these challenges and enable reinforcement learning with fully decentralized training and execution. Our approach leverages state information to pre-train a probabilistic belief model in a self-supervised fashion. The resulting belief states, which capture both inferred state information and the uncertainty over this information, are then used in a state-based reinforcement learning algorithm to create an end-to-end model for cooperative multi-agent reinforcement learning under partial observability. By separating the belief and reinforcement learning tasks, we are able to significantly simplify the policy and value function learning tasks and improve both the convergence speed and the final performance. We evaluate our proposed method on diverse partially observable multi-agent tasks designed to exhibit different variants of partial observability.
