Tackling Uncertainties in Multi-Agent Reinforcement Learning through Integration of Agent Termination Dynamics

Abstract

Multi-Agent Reinforcement Learning (MARL) has gained significant traction for solving complex real-world tasks, but the inherent stochasticity and uncertainty in these environments pose substantial challenges to efficient and robust policy learning. While Distributional Reinforcement Learning has been successfully applied in single-agent settings to address risk and uncertainty, its application in MARL is substantially limited. In this work, we propose a novel approach that integrates distributional learning with a safety-focused loss function to improve convergence in cooperative MARL tasks. Specifically, we introduce a Barrier Function based loss that leverages safety metrics, identified from inherent faults in the system, into the policy learning process. This additional loss term helps mitigate risks and encourages safer exploration during the early stages of training. We evaluate our method in the StarCraft II micromanagement benchmark, where our approach demonstrates improved convergence and outperforms state-of-the-art baselines in terms of both safety and task completion. Our results suggest that incorporating safety considerations can significantly enhance learning performance in complex, multi-agent environments.
