OmniSafe: An Infrastructure for Accelerating Safe Reinforcement Learning Research

Abstract

AI systems empowered by reinforcement learning (RL) algorithms hold immense potential to catalyze societal advancement, yet their deployment is often impeded by significant safety concerns. Particularly in safety-critical applications, researchers have raised concerns about unintended harms or unsafe behaviors of unaligned RL agents. The philosophy of safe reinforcement learning (SafeRL) is to align RL agents with harmless intentions and safe behavioral patterns. In SafeRL, agents learn to develop optimal policies by receiving feedback from the environment, while also fulfilling the requirement of minimizing the risk of unintended harm or unsafe behavior. However, due to the intricate nature of SafeRL algorithm implementation, combining methodologies across various domains presents a formidable challenge. This has led to the absence of a cohesive and efficacious learning framework within the contemporary SafeRL research milieu. In this work, we introduce a foundational framework designed to expedite SafeRL research endeavors. Our comprehensive framework encompasses an array of algorithms spanning different RL domains and places heavy emphasis on safety elements. We aim to make the SafeRL-related research process more streamlined and efficient, thereby facilitating further research in AI safety. Our project is released at: https://github.com/PKU-Alignment/omnisafe.
