Adversarially Robust Low Dimensional Representations

Abstract

Adversarial or test time robustness measures the susceptibility of a machinelearning system to small perturbations made to the input at test time. This hasattracted much interest on the empirical side, since many existing ML systemsperform poorly under imperceptible adversarial perturbations to the testinputs. On the other hand, our theoretical understanding of this phenomenon islimited, and has mostly focused on supervised learning tasks. In this work we study the problem of computing adversarially robustrepresentations of data. We formulate a natural extension of PrincipalComponent Analysis (PCA) where the goal is to find a low dimensional subspaceto represent the given data with minimum projection error, and that is inaddition robust to small perturbations measured in $\ell_q$ norm (say$q=\infty$). Unlike PCA which is solvable in polynomial time, our formulationis computationally intractable to optimize as it captures the well-studiedsparse PCA objective. We show the following algorithmic and statistical results. - Polynomial time algorithms in the worst-case that achieve constant factorapproximations to the objective while only violating the robustness constraintby a constant factor. - We prove that our formulation (and algorithms) also enjoy significantstatistical benefits in terms of sample complexity over standard PCA on accountof a "regularization effect", that is formalized using the well-studied spikedcovariance model. - Surprisingly, we show that our algorithmic techniques can also be maderobust to corruptions in the training data, in addition to yieldingrepresentations that are robust at test time! Here an adversary is allowed tocorrupt potentially every data point up to a specified amount in the $\ell_q$norm. We further apply these techniques for mean estimation and clusteringunder adversarial corruptions to the training data.

Quick Read (beta)

loading the full paper ...