Safety-Gymnasium: A Unified Safe Reinforcement Learning Benchmark

Abstract

Artificial intelligence (AI) systems possess significant potential to drive societal progress. However, their deployment often faces obstacles due to substantial safety concerns. Safe reinforcement learning (SafeRL) emerges as a solution to optimize policies while simultaneously adhering to multiple constraints, thereby addressing the challenge of integrating reinforcement learning in safety-critical scenarios. In this paper, we present an environment suite called Safety-Gymnasium, which encompasses safety-critical tasks in both single- and multi-agent scenarios, accepting vector and vision-only input. Additionally, we offer a library of algorithms named Safe Policy Optimization (SafePO), comprising 16 state-of-the-art SafeRL algorithms. This comprehensive library can serve as a validation tool for the research community. By introducing this benchmark, we aim to facilitate the evaluation and comparison of safety performance, thus fostering the development of reinforcement learning for safer, more reliable, and responsible real-world applications. The website of this project can be accessed at https://sites.google.com/view/safety-gymnasium.
