A Farewell to the Bias-Variance Tradeoff? An Overview of the Theory of Overparameterized Machine Learning

Abstract

The rapid recent progress in machine learning (ML) has raised a number of scientific questions that challenge the longstanding dogma of the field. One of the most important riddles is the good empirical generalization of overparameterized models. Overparameterized models are excessively complex with respect to the size of the training dataset, which results in them perfectly fitting (i.e., interpolating) the training data, which is usually noisy. Such interpolation of noisy data is traditionally associated with detrimental overfitting, and yet a wide range of interpolating models -- from simple linear models to deep neural networks -- have recently been observed to generalize extremely well on fresh test data. Indeed, the recently discovered double descent phenomenon has revealed that highly overparameterized models often improve over the best underparameterized model in test performance. Understanding learning in this overparameterized regime requires new theory and foundational empirical studies, even for the simplest case of the linear model. The underpinnings of this understanding have been laid in very recent analyses of overparameterized linear regression and related statistical learning tasks, which resulted in precise analytic characterizations of double descent. This paper provides a succinct overview of this emerging theory of overparameterized ML (henceforth abbreviated as TOPML) that explains these recent findings through a statistical signal processing perspective. We emphasize the unique aspects that define the TOPML research area as a subfield of modern ML theory and outline interesting open questions that remain.
