A Theory of Label Propagation for Subpopulation Shift

Abstract

One of the central problems in machine learning is domain adaptation. Unlike past theoretical work, we consider a new model for subpopulation shift in the input or representation space. In this work, we propose a provably effective framework for domain adaptation based on label propagation. In our analysis, we use a simple but realistic ``expansion'' assumption, proposed in \citet{wei2021theoretical}. Using a teacher classifier trained on the source domain, our algorithm not only propagates to the target domain but also improves upon the teacher. By leveraging existing generalization bounds, we also obtain end-to-end finite-sample guarantees on the entire algorithm. In addition, we extend our theoretical framework to a more general setting of source-to-target transfer based on a third unlabeled dataset, which can be easily applied in various learning scenarios.
