Automatic Testing and Falsification with Dynamically Constrained Reinforcement Learning

Abstract

We consider the problem of using reinforcement learning to train adversarial agents for automatic testing and falsification of cyberphysical systems, such as autonomous vehicles, robots, and airplanes. To produce useful agents, however, one must be able to control the degree of adversariality by specifying rules that an agent must follow. For example, when testing an autonomous vehicle, it is useful to find maximally antagonistic traffic participants that obey traffic rules. We model dynamic constraints as hierarchically ordered rules expressed in Signal Temporal Logic, and show how these can be incorporated into an agent training process. We prove that our agent-centric approach can find all dangerous behaviors that can be found by traditional falsification techniques while producing modular and reusable agents. We demonstrate our approach on two case studies from the automotive domain.
