Solving Online Threat Screening Games using Constrained Action Space Reinforcement Learning

Abstract

Large-scale screening for potential threats with limited resources and capacity for screening is a problem of interest at airports, seaports, and other ports of entry. Adversaries can observe screening procedures and arrive at a time when there will be gaps in screening due to limited resource capacities. To capture this game between ports and adversaries, this problem has been previously represented as a Stackelberg game, referred to as a Threat Screening Game (TSG). Given the significant complexity associated with solving TSGs and uncertainty in arrivals of customers, existing work has assumed that screenees arrive and are allocated security resources at the beginning of the time window. In practice, screenees such as airport passengers arrive in bursts correlated with flight time and are not bound by fixed time windows. To address this, we propose an online threat screening model in which screening strategy is determined adaptively as a passenger arrives while satisfying a hard bound on acceptable risk of not screening a threat. To solve the online problem with a hard bound on risk, we formulate it as a Reinforcement Learning (RL) problem with constraints on the action space (hard bound on risk). We provide a novel way to efficiently enforce linear inequality constraints on the action output in Deep Reinforcement Learning. We show that our solution allows us to significantly reduce screenee wait time while guaranteeing a bound on risk.
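
To make the idea of a linear inequality constraint on an action concrete, the sketch below projects a raw action vector onto a single half-space {a : w · a ≤ b}. This is a generic textbook projection, not the enforcement method proposed in the paper, which the abstract does not describe. The function name, and the reading of `weights` as per-component risk coefficients and `bound` as the risk cap, are assumptions made for illustration only.

```python
def project_onto_halfspace(action, weights, bound):
    """Euclidean projection of `action` onto {a : weights . a <= bound}.

    Hypothetical helper: `weights` and `bound` stand in for one linear
    risk constraint; they are not taken from the paper.
    """
    # Amount by which the raw action violates the constraint.
    violation = sum(w * a for w, a in zip(weights, action)) - bound
    if violation <= 0:
        return list(action)  # already feasible, leave unchanged

    norm_sq = sum(w * w for w in weights)
    if norm_sq == 0:
        raise ValueError("zero weight vector: constraint cannot be satisfied")

    # Move along -weights just far enough to land on the boundary.
    step = violation / norm_sq
    return [a - step * w for w, a in zip(weights, action)]


raw_action = [0.6, 0.8]            # e.g. an unconstrained policy output
safe_action = project_onto_halfspace(raw_action, [1.0, 1.0], 1.0)
print(safe_action)                 # components now sum to the bound (up to rounding)
```

A single half-space has this closed-form projection; several simultaneous inequalities generally require solving a small quadratic program or another dedicated scheme.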
