When Does Machine Learning FAIL? Generalized Transferability for Evasion and Poisoning Attacks

Abstract

Attacks against machine learning systems represent a growing threat as highlighted by the abundance of attacks proposed lately. However, attacks often make unrealistic assumptions about the knowledge and capabilities of adversaries. To evaluate this threat systematically, we propose the FAIL attacker model, which describes the adversary's knowledge and control along four dimensions. The FAIL model allows us to consider a wide range of weaker adversaries that have limited control and incomplete knowledge of the features, learning algorithms and training instances utilized. Within this framework, we evaluate the generalized transferability of a known evasion attack and we design StingRay, a targeted poisoning attack that is broadly applicable: it is practical against 4 machine learning applications, which use 3 different learning algorithms, and it can bypass 2 existing defenses. Our evaluation provides deeper insights into the transferability of poison and evasion samples across models and suggests promising directions for investigating defenses against this threat.
