Computational-Statistical Tradeoffs from NP-hardness

Abstract

A central question in computer science and statistics is whether efficientalgorithms can achieve the information-theoretic limits of statisticalproblems. Many computational-statistical tradeoffs have been shown underaverage-case assumptions, but since statistical problems are average-case innature, it has been a challenge to base them on standard worst-caseassumptions. In PAC learning where such tradeoffs were first studied, the question iswhether computational efficiency can come at the cost of using more samplesthan information-theoretically necessary. We base such tradeoffs on$\mathsf{NP}$-hardness and obtain: $\circ$ Sharp computational-statistical tradeoffs assuming $\mathsf{NP}$requires exponential time: For every polynomial $p(n)$, there is an $n$-variateclass $C$ with VC dimension $1$ such that the sample complexity oftime-efficiently learning $C$ is $\Theta(p(n))$. $\circ$ A characterization of $\mathsf{RP}$ vs. $\mathsf{NP}$ in terms oflearning: $\mathsf{RP} = \mathsf{NP}$ iff every $\mathsf{NP}$-enumerable classis learnable with $O(\mathrm{VCdim}(C))$ samples in polynomial time. Theforward implication has been known since (Pitt and Valiant, 1988); we prove thereverse implication. Notably, all our lower bounds hold against improper learners. These are thefirst $\mathsf{NP}$-hardness results for improperly learning a subclass ofpolynomial-size circuits, circumventing formal barriers of Applebaum, Barak,and Xiao (2008).

Quick Read (beta)

loading the full paper ...