Barrier Certified Safety Learning Control: When Sum-of-Square Programming Meets Reinforcement Learning

Abstract

Safety guarantees are essential in many engineering implementations. Reinforcement learning provides a useful way to strengthen safety. However, reinforcement learning algorithms cannot completely guarantee safety in realistic operation. To address this issue, this work combines control barrier functions with reinforcement learning and proposes a compensated algorithm that completely maintains safety. Specifically, sum-of-squares programming is used to search for the optimal controller and to tune the learning hyperparameters simultaneously. Thus, the control actions are guaranteed to always remain within the safe region. The effectiveness of the proposed method is demonstrated on an inverted pendulum model. Compared with quadratic-programming-based reinforcement learning methods, the proposed sum-of-squares-programming-based reinforcement learning method is shown to be superior.
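
For orientation, the quadratic-programming baseline mentioned above can be illustrated with a minimal sketch. This is not the sum-of-squares method proposed in this work. It assumes a toy scalar system x' = u with safe set |x| <= 1 and barrier function h(x) = 1 - x^2, rather than the inverted pendulum, and the names `cbf_filter`, `safe_action`, and `alpha` are hypothetical. For a scalar input, the quadratic program that minimally modifies the learned action u_rl subject to the barrier condition L_f h + L_g h u + alpha h >= 0 has a closed-form solution:

```python
def cbf_filter(u_rl, lf_h, lg_h, h, alpha=1.0):
    """Return the action closest to u_rl that satisfies the barrier condition
    lf_h + lg_h * u + alpha * h >= 0 (scalar input, single constraint)."""
    b = lf_h + alpha * h
    # The learned action is already safe: pass it through unchanged.
    if lg_h * u_rl + b >= 0.0:
        return u_rl
    # The input has no influence on h here, so no action can satisfy the constraint.
    if lg_h == 0.0:
        raise ValueError("CBF constraint infeasible at this state")
    # Otherwise project onto the constraint boundary lg_h * u + b = 0.
    return -b / lg_h


def safe_action(x, u_rl, alpha=1.0):
    """Toy system (assumption): x' = u, so f = 0 and g = 1, with h(x) = 1 - x^2."""
    h = 1.0 - x * x
    lf_h = 0.0        # dh/dx * f(x), with f = 0
    lg_h = -2.0 * x   # dh/dx * g(x), with g = 1
    return cbf_filter(u_rl, lf_h, lg_h, h, alpha)
```

Near the boundary of the safe set, an action that pushes outward is scaled back, while actions that already satisfy the barrier condition are left untouched. In the full setting, the barrier condition must hold for the actual system dynamics, which is where a certificate search such as sum-of-squares programming comes in.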
