Towards Safe Continuing Task Reinforcement Learning

  • 2021-02-24 22:12:25
  • Miguel Calvo-Fullana, Luiz F. O. Chamon, Santiago Paternain

Abstract

Safety is a critical feature of controller design for physical systems. When designing control policies, several approaches to guarantee this aspect of autonomy have been proposed, such as robust controllers or control barrier functions. However, these solutions strongly rely on the model of the system being available to the designer. As a parallel development, reinforcement learning provides model-agnostic control solutions but, in general, it lacks the theoretical guarantees required for safety. Recent advances show that, under mild conditions, control policies can be learned via reinforcement learning, which can be guaranteed to be safe by imposing these requirements as constraints of an optimization problem. However, to transfer from learning safety to learning safely, there are two hurdles that need to be overcome: (i) it has to be possible to learn the policy without having to re-initialize the system; and (ii) the rollouts of the system need to be in themselves safe. In this paper, we tackle the first issue, proposing an algorithm capable of operating in the continuing task setting without the need of restarts. We evaluate our approach in a numerical example, which shows the capabilities of the proposed approach in learning safe policies via safe exploration.

 
