Abstract
Reinforcement learning (RL) controllers are flexible and performant but rarely guarantee safety. Safety filters impart hard safety guarantees to RL controllers while maintaining flexibility. However, safety filters can cause undesired behaviours due to the separation between the controller and the safety filter, often degrading performance and robustness. In this paper, we analyze several modifications that incorporate the safety filter into the training of RL controllers rather than applying it solely during evaluation. These modifications allow the RL controller to learn to account for the safety filter, improving performance. This paper presents a comprehensive analysis of training RL with safety filters, featuring simulated and real-world experiments with a Crazyflie 2.0 drone. We examine how various training modifications and hyperparameters impact performance, sample efficiency, safety, and chattering. Our findings serve as a guide for practitioners and researchers focused on safety filters and safe RL.
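
To make the idea of training with the filter in the loop concrete, the following is a minimal sketch and not the method evaluated in this paper. It assumes a hypothetical one-dimensional integrator, a box-shaped safe set, and a reward penalty proportional to the size of the filter's correction; the names `SafetyFilter`, `filtered_step`, and `penalty_weight` are illustrative only.

```python
from dataclasses import dataclass


@dataclass
class SafetyFilter:
    """Hypothetical filter for a 1-D integrator x_next = x + dt * u."""

    x_max: float = 1.0  # assumed safe set: |x| <= x_max
    dt: float = 0.1     # assumed integration step

    def __call__(self, x: float, u: float) -> float:
        # Return the action closest to u that keeps the next state
        # inside the safe set (a clip, since the dynamics are 1-D).
        u_low = (-self.x_max - x) / self.dt
        u_high = (self.x_max - x) / self.dt
        return min(max(u, u_low), u_high)


def filtered_step(x, u_rl, safety_filter, penalty_weight=1.0):
    """One environment step with the safety filter in the loop.

    Returns the next state, the action actually applied, and a reward
    penalty proportional to how much the filter changed the RL action.
    The penalty term is an assumption made for illustration.
    """
    u_safe = safety_filter(x, u_rl)
    x_next = x + safety_filter.dt * u_safe
    penalty = -penalty_weight * abs(u_safe - u_rl)
    return x_next, u_safe, penalty
```

In this sketch the RL controller observes the consequences of the filtered action during training and is penalized whenever the filter intervenes, which is one plausible way for it to learn to account for the filter rather than encountering it only at evaluation time.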