A Theory of Label Propagation for Subpopulation Shift

  • 2021-02-22 17:27:47
  • Tianle Cai, Ruiqi Gao, Jason D. Lee, Qi Lei
One of the central problems in machine learning is domain adaptation. Unlike past theoretical work, we consider a new model for subpopulation shift in the input or representation space. In this work, we propose a provably effective framework for domain adaptation based on label propagation. In our analysis, we use a simple but realistic ``expansion'' assumption, proposed in \citet{wei2021theoretical}. Using a teacher classifier trained on the source domain, our algorithm not only propagates to the target domain but also improves upon the teacher. By leveraging existing generalization bounds, we also obtain end-to-end finite-sample guarantees on the entire algorithm. In addition, we extend our theoretical framework to a more general setting of source-to-target transfer based on a third unlabeled dataset, which can be easily applied in various learning scenarios.


