Directional Adversarial Training for Cost Sensitive Deep Learning Classification Applications

Abstract

In many real-world applications of machine learning, it is of paramount importance not only to provide accurate predictions but also to ensure certain levels of robustness. Adversarial training is a training procedure that aims to produce models that are robust to worst-case perturbations around predefined points. Unfortunately, one of the main issues in adversarial training is that robustness w.r.t. gradient-based attackers is always achieved at the cost of prediction accuracy. In this paper, a new algorithm for adversarial training, called Wasserstein Projected Gradient Descent (WPGD), is proposed. WPGD provides a simple way to obtain cost-sensitive robustness, resulting in finer control of the robustness-accuracy trade-off. Moreover, WPGD solves an optimal transport problem on the output space of the network and can efficiently discover directions where robustness is required, making it possible to control the directional trade-off between accuracy and robustness. The proposed WPGD is validated on image recognition tasks with different benchmark datasets and architectures. Moreover, datasets that resemble real-world data are often unbalanced: this paper shows that, on such datasets, the performance of adversarial training is mainly affected in terms of standard accuracy.
