Affine Invariant Covariance Estimation for Heavy-Tailed Distributions

  • 2019-02-08 14:13:24
  • Dmitrii Ostrovskii, Alessandro Rudi
In this work we provide an estimator for the covariance matrix of a heavy-tailed random vector. We prove that the proposed estimator $\widehat{\mathbf{S}}$ admits \textit{affine-invariant} bounds of the form $$(1-\varepsilon) \mathbf{S} \preccurlyeq \widehat{\mathbf{S}} \preccurlyeq (1+\varepsilon) \mathbf{S}$$ with high probability, where $\mathbf{S}$ is the unknown covariance matrix, and $\preccurlyeq$ is the positive semidefinite order on symmetric matrices. The result only requires the existence of fourth-order moments, and allows for $\varepsilon = O(\sqrt{\kappa^4 d/n})$, where $\kappa^4$ is some measure of kurtosis of the distribution, $d$ is the dimensionality of the space, and $n$ is the sample size. More generally, we can allow for regularization with level~$\lambda$, in which case $\varepsilon$ depends on the number of degrees of freedom, which is generally smaller than $d$. The computational cost of the proposed estimator is essentially~$O(d^2 n + d^3)$, comparable to the computational cost of the sample covariance matrix in the statistically interesting regime~$n \gg d$. Its applications to eigenvalue estimation with relative error and to ridge regression with heavy-tailed random design are discussed.
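
To make the form of the bound concrete, the sketch below computes the smallest $\varepsilon$ for which $(1-\varepsilon) \mathbf{S} \preccurlyeq \widehat{\mathbf{S}} \preccurlyeq (1+\varepsilon) \mathbf{S}$ holds, and checks numerically that this quantity does not change under an invertible affine map of the data. It is only an illustration under stated assumptions: the abstract does not describe the proposed estimator, so the ordinary sample covariance is used as a stand-in, and the helper name `relative_spectral_error`, the Student-t data model, and the map `A`, `b` are all hypothetical choices made for this example.

```python
import numpy as np

def relative_spectral_error(S_hat, S):
    """Smallest eps >= 0 with (1 - eps) S <= S_hat <= (1 + eps) S (PSD order).

    Assumes S is symmetric positive definite. With S = L L^T (Cholesky),
    the two-sided bound is equivalent to ||L^{-1} S_hat L^{-T} - I||_op <= eps.
    """
    L = np.linalg.cholesky(S)
    B = np.linalg.solve(L, S_hat)        # B = L^{-1} S_hat
    M = np.linalg.solve(L, B.T).T        # M = L^{-1} S_hat L^{-T}
    M = (M + M.T) / 2                    # remove round-off asymmetry
    eigs = np.linalg.eigvalsh(M)
    return float(np.max(np.abs(eigs - 1.0)))

rng = np.random.default_rng(0)
n, d, nu = 20000, 5, 5                   # nu > 4, so fourth moments exist

# Heavy-tailed sample: i.i.d. Student-t coordinates (an assumed toy model).
X = rng.standard_t(df=nu, size=(n, d))
S = (nu / (nu - 2)) * np.eye(d)          # true covariance of this model

# Stand-in estimator: the sample covariance, NOT the paper's estimator.
S_hat = np.cov(X, rowvar=False)
eps = relative_spectral_error(S_hat, S)

# Apply an invertible affine map x -> A x + b to every sample.
A = rng.normal(size=(d, d)) + 3.0 * np.eye(d)
b = rng.normal(size=d)
X_t = X @ A.T + b
S_t = A @ S @ A.T                        # covariance after the map
S_hat_t = np.cov(X_t, rowvar=False)
eps_t = relative_spectral_error(S_hat_t, S_t)

print("relative error, original data:   ", eps)
print("relative error, transformed data:", eps_t)
```

The two printed values should agree up to round-off. The sample covariance is affine-equivariant, so its relative error is affine-invariant as well; the abstract's claim concerns a different estimator whose error is controlled under fourth-moment assumptions only, which this sketch does not attempt to reproduce.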


