Double Double Descent: On Generalization Errors in Transfer Learning between Linear Regression Tasks

Abstract

We study the transfer learning process between two linear regression problems. An important and timely special case is when the regressors are overparameterized and perfectly interpolate their training data. We examine a parameter transfer mechanism whereby a subset of the parameters of the target task solution are constrained to the values learned for a related source task. We analytically characterize the generalization error of the target task in terms of the salient factors in the transfer learning architecture, i.e., the number of examples available, the number of (free) parameters in each of the tasks, the number of parameters transferred from the source to target task, and the correlation between the two tasks. Our non-asymptotic analysis shows that the generalization error of the target task follows a two-dimensional double descent trend (with respect to the number of free parameters in each of the tasks) that is controlled by the transfer learning factors. Our analysis points to specific cases where the transfer of parameters is beneficial.
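The parameter transfer mechanism described above can be illustrated with a minimal NumPy sketch. It is not the paper's exact formulation: the function names (`min_norm_fit`, `transfer_fit`), the choice to treat every non-transferred coordinate as free, and the use of the pseudoinverse to obtain a minimum-norm interpolating solution are all assumptions made for illustration.

```python
import numpy as np


def min_norm_fit(X, y):
    """Least-squares fit via the pseudoinverse.

    When X has more columns than rows (overparameterized), this returns
    the minimum-norm solution among those that fit the data.
    """
    return np.linalg.pinv(X) @ y


def transfer_fit(X_t, y_t, beta_src, transferred):
    """Fit the target task with some coordinates fixed to source values.

    `transferred` is an index array (hypothetical name) of the coordinates
    copied from the source estimate `beta_src`. The remaining coordinates
    are treated as free and fitted to the residual of the target data.
    """
    d = X_t.shape[1]
    free = np.setdiff1d(np.arange(d), transferred)
    beta = np.zeros(d)
    beta[transferred] = beta_src[transferred]
    # Remove the contribution of the fixed coordinates, then fit the rest.
    residual = y_t - X_t[:, transferred] @ beta[transferred]
    beta[free] = min_norm_fit(X_t[:, free], residual)
    return beta


# Illustrative setup (all sizes and the noise level are assumptions).
rng = np.random.default_rng(0)
d, n_src, n_tgt = 40, 100, 20
beta_true_src = rng.normal(size=d)
# The target task is a noisy copy of the source task, so the two are correlated.
beta_true_tgt = beta_true_src + 0.1 * rng.normal(size=d)

X_src = rng.normal(size=(n_src, d))
y_src = X_src @ beta_true_src + 0.01 * rng.normal(size=n_src)
X_tgt = rng.normal(size=(n_tgt, d))
y_tgt = X_tgt @ beta_true_tgt + 0.01 * rng.normal(size=n_tgt)

beta_src_hat = min_norm_fit(X_src, y_src)
transferred = np.arange(10)  # transfer the first 10 coordinates
beta_tgt_hat = transfer_fit(X_tgt, y_tgt, beta_src_hat, transferred)
```

In this setup the target task keeps 30 free parameters for 20 examples, so the fitted target solution interpolates its training data while the transferred coordinates stay at their source values. Varying the number of free and transferred coordinates is one way to probe the trends the abstract describes.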
