Abstract
As network security threats continue to evolve, safeguarding Machine Learning (ML)-based Network Intrusion Detection Systems (NIDS) from adversarial attacks is crucial. This paper introduces the notion of feature perturb-ability and presents a novel Perturb-ability Score (PS) metric that identifies NIDS features susceptible to manipulation in the problem-space by an attacker. By quantifying a feature's susceptibility to perturbations within the problem-space, the PS makes it possible, during the feature selection phase, to select features that are inherently more robust against evasion adversarial attacks on ML-NIDS. These features exhibit natural resilience to perturbations, as they are heavily constrained by the problem-space limitations and correlations of the NIDS domain. Furthermore, manipulating these features may either disrupt the malicious function of evasion adversarial attacks on NIDS or render the network traffic invalid for processing (or both). The proposed approach takes a fresh angle by leveraging network domain constraints as a defense mechanism against problem-space evasion adversarial attacks targeting ML-NIDS. We demonstrate the effectiveness of our PS-guided feature selection defense in enhancing NIDS robustness. Experimental results across various ML-based NIDS models and public datasets show that selecting only robust features (low-PS features) can maintain solid detection performance while significantly reducing vulnerability to evasion adversarial attacks. Additionally, our findings confirm that the PS effectively identifies NIDS features highly vulnerable to problem-space perturbations.