Abstract
We develop and analyze a principled approach to kernel ridge regression under covariate shift. The goal is to learn a regression function with small mean squared error over a target distribution, based on unlabeled data from there and labeled data that may have a different feature distribution. We propose to split the labeled data into two subsets, and conduct kernel ridge regression on them separately to obtain a collection of candidate models and an imputation model. We use the latter to fill the missing labels and then select the best candidate accordingly. Our non-asymptotic excess risk bounds demonstrate that our estimator adapts effectively to both the structure of the target distribution and the covariate shift. This adaptation is quantified through a notion of effective sample size that reflects the value of labeled source data for the target regression task. Our estimator achieves the minimax optimal error rate up to a polylogarithmic factor, and we find that using pseudo-labels for model selection does not significantly hinder performance.
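The split, impute, and select procedure described above can be illustrated with a minimal sketch. It is not the exact estimator analyzed in the paper: the Gaussian kernel and its bandwidth, the even split of the labeled data, the penalty grid `lam_grid`, the imputation penalty `lam_impute`, the `n * lam` scaling of the ridge term, and the synthetic data are all assumptions made for illustration, and every function name is hypothetical.

```python
import numpy as np


def rbf_kernel(A, B, bandwidth=0.5):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))


def fit_krr(X, y, lam):
    """Fit kernel ridge regression with penalty lam and return a predictor."""
    n = len(X)
    # Solve (K + n * lam * I) alpha = y; the n * lam scaling is one common convention.
    alpha = np.linalg.solve(rbf_kernel(X, X) + n * lam * np.eye(n), y)
    return lambda Z: rbf_kernel(Z, X) @ alpha


def select_with_pseudo_labels(X_src, y_src, X_tgt, lam_grid, lam_impute, rng):
    """Pick the candidate whose predictions best match pseudo-labels on the target."""
    # Split the labeled source data into two disjoint halves (assumed 50/50).
    perm = rng.permutation(len(X_src))
    half = len(X_src) // 2
    idx_a, idx_b = perm[:half], perm[half:]

    # First half: one candidate model per penalty in the grid.
    candidates = [fit_krr(X_src[idx_a], y_src[idx_a], lam) for lam in lam_grid]

    # Second half: imputation model that fills in the missing target labels.
    imputer = fit_krr(X_src[idx_b], y_src[idx_b], lam_impute)
    pseudo_labels = imputer(X_tgt)

    # Score each candidate by mean squared error against the pseudo-labels.
    errors = [float(np.mean((f(X_tgt) - pseudo_labels) ** 2)) for f in candidates]
    best = int(np.argmin(errors))
    return candidates[best], lam_grid[best], errors


# Synthetic example: source features ~ N(0, 1), target features ~ N(1, 0.25).
rng = np.random.default_rng(0)
X_src = rng.normal(0.0, 1.0, size=(200, 1))
y_src = np.sin(3.0 * X_src[:, 0]) + 0.1 * rng.normal(size=200)
X_tgt = rng.normal(1.0, 0.5, size=(100, 1))  # unlabeled target sample

lam_grid = [1e-4, 1e-3, 1e-2, 1e-1, 1.0]
model, best_lam, errors = select_with_pseudo_labels(
    X_src, y_src, X_tgt, lam_grid, lam_impute=1e-3, rng=rng
)
print("selected penalty:", best_lam)
```

The candidates and the imputation model are fit on disjoint halves, so the pseudo-labels used for scoring are independent of the candidates given the features. The selection step never touches a target label, which matches the setting in which only unlabeled target data are available.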