Abstract
The growing threat of low-cost kamikaze drone swarms poses a critical challenge to modern defense systems, demanding rapid and strategic decision-making to prioritize interceptions across multiple effectors and high-value target zones. In this work, we present a case study demonstrating the practical advantages of reinforcement learning in addressing this challenge. We introduce a high-fidelity simulation environment that captures realistic operational constraints, within which a decision-level reinforcement learning agent learns to coordinate multiple effectors for optimal interception prioritization. Operating in a discrete action space, the agent selects which drone to engage per effector based on observed state features such as positions, classes, and effector status. We evaluate the learned policy against a handcrafted rule-based baseline across hundreds of simulated attack scenarios. The reinforcement-learning-based policy consistently achieves lower average damage and higher defensive efficiency in protecting critical zones. This case study highlights the potential of reinforcement learning as a strategic layer within defense architectures, enhancing resilience without displacing existing control systems. All code and simulation assets are publicly released for full reproducibility, and a video demonstration illustrates the policy's qualitative behavior.