### Abstract

We consider learning in an adversarial environment, where an$\varepsilon$-fraction of samples from a distribution $P$ are arbitrarilymodified (*global* corruptions) and the remaining perturbations have averagemagnitude bounded by $\rho$ (*local* corruptions). Given access to $n$ suchcorrupted samples, we seek a computationally efficient estimator $\hat{P}_n$that minimizes the Wasserstein distance $\mathsf{W}_1(\hat{P}_n,P)$. In fact,we attack the fine-grained task of minimizing $\mathsf{W}_1(\Pi_\# \hat{P}_n,\Pi_\# P)$ for all orthogonal projections $\Pi \in \mathbb{R}^{d \times d}$,with performance scaling with $\mathrm{rank}(\Pi) = k$. This allows us toaccount simultaneously for mean estimation ($k=1$), distribution estimation($k=d$), as well as the settings interpolating between these two extremes. Wecharacterize the optimal population-limit risk for this task and then developan efficient finite-sample algorithm with error bounded by $\sqrt{\varepsilonk} + \rho + d^{O(1)}\tilde{O}(n^{-1/k})$ when $P$ has bounded moments of order$2+\delta$, for constant $\delta > 0$. For data distributions with boundedcovariance, our finite-sample bounds match the minimax population-level optimumfor large sample sizes. Our efficient procedure relies on a novel trace normapproximation of an ideal yet intractable 2-Wasserstein projection estimator.We apply this algorithm to robust stochastic optimization, and, in the process,uncover a new method for overcoming the curse of dimensionality in Wassersteindistributionally robust optimization.