Safe Inverse Reinforcement Learning via Control Barrier Function

Abstract

Learning from Demonstration (LfD) is a powerful method for enabling robots to perform novel tasks: it is often more tractable for a non-roboticist end-user to demonstrate the desired skill, and for the robot to learn efficiently from the associated data, than for a human to engineer a reward function from which the robot learns the skill via reinforcement learning (RL). Safety issues arise in modern LfD techniques, e.g., Inverse Reinforcement Learning (IRL), just as they do in RL; yet safe learning in LfD has received little attention. For agile robots, safety is especially vital because of the possibility of robot-environment collision, robot-human collision, and damage to the robot. In this paper, we propose a safe IRL framework, CBFIRL, that leverages the Control Barrier Function (CBF) to enhance the safety of the IRL policy. The core idea of CBFIRL is to combine a loss function inspired by CBF requirements with the objective of an IRL method, and to optimize the two jointly via gradient descent. In our experiments, the framework behaves more safely than IRL methods without a CBF, with improvements of $\sim15\%$ and $\sim20\%$ on two difficulty levels of a 2D racecar domain and $\sim50\%$ on a 3D drone domain.
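
To make the core idea concrete, the following is a minimal sketch of how a CBF-inspired penalty might be added to an IRL objective. It is not the paper's implementation: the function names (`cbf_loss`, `joint_loss`), the hinge form of the penalty, the discrete-time approximation of the barrier derivative, and the parameters `dt`, `alpha`, `margin`, and `weight` are all assumptions chosen for illustration. The barrier values `h` would come from a learned function evaluated on sampled states, and the IRL loss is treated here as an opaque scalar.

```python
import numpy as np

def cbf_loss(h_safe, h_unsafe, h_now, h_next, dt=0.1, alpha=1.0, margin=0.0):
    """Hinge-style penalty for violating three CBF-like conditions (illustrative)."""
    # Safe states should satisfy h(x) >= margin.
    safe_term = np.maximum(margin - h_safe, 0.0).mean()
    # Unsafe states should satisfy h(x) <= -margin.
    unsafe_term = np.maximum(margin + h_unsafe, 0.0).mean()
    # Along trajectories: (h(x') - h(x)) / dt + alpha * h(x) >= 0.
    # This is an assumed discrete-time stand-in for the derivative condition.
    deriv = (h_next - h_now) / dt + alpha * h_now
    deriv_term = np.maximum(-deriv, 0.0).mean()
    return safe_term + unsafe_term + deriv_term

def joint_loss(irl_loss, cbf_penalty, weight=1.0):
    """Weighted sum of an IRL objective and the CBF penalty (weight is hypothetical)."""
    return irl_loss + weight * cbf_penalty
```

In a full training loop, a gradient-based optimizer would minimize `joint_loss` with respect to the parameters of both the policy and the barrier function; that step is omitted here because it depends on the chosen IRL method.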
