Abstract
Reinforcement learning (RL) has achieved tremendous progress in solving various sequential decision-making problems, e.g., control tasks in robotics. However, RL methods often fail to generalize to safety-critical scenarios since policies are overfitted to training environments. Previously, robust adversarial reinforcement learning (RARL) was proposed to train an adversarial network that applies disturbances to a system, which improves robustness in test scenarios. A drawback of neural-network-based adversaries is that integrating system requirements without handcrafting sophisticated reward signals is difficult. Safety falsification methods allow one to find a set of initial conditions as well as an input sequence, such that the system violates a given property formulated in temporal logic. In this paper, we propose falsification-based RARL (FRARL), the first generic framework for integrating temporal-logic falsification in adversarial learning to improve policy robustness. With a falsification method, we do not need to construct an extra reward function for the adversary. We evaluate our approach on a braking assistance system and an adaptive cruise control system of autonomous vehicles. Experiments show that policies trained with a falsification-based adversary generalize better and show fewer violations of the safety specification in test scenarios than the ones trained without an adversary or with an adversarial network.