Formal Bayesian Transfer Learning via the Total Risk Prior

Abstract

In analyses with severe data limitations, augmenting the target dataset with information from ancillary datasets in the application domain, called source datasets, can lead to significantly improved statistical procedures. However, existing methods for this transfer learning struggle to deal with situations where the source datasets are also limited and not guaranteed to be well aligned with the target dataset. A typical strategy is to use the empirical loss minimizer on the source data as a prior mean for the target parameters, which places the estimation of source parameters outside of the Bayesian formalism. Our key conceptual contribution is to use a risk minimizer conditional on source parameters instead. This allows us to construct a single joint prior distribution for all parameters from the source datasets as well as the target dataset. As a consequence, we benefit from full Bayesian uncertainty quantification and can perform model averaging via Gibbs sampling over indicator variables governing the inclusion of each source dataset. We show how a particular instantiation of our prior leads to a Bayesian Lasso in a transformed coordinate system and discuss computational techniques to scale our approach to moderately sized datasets. We also demonstrate that recently proposed minimax-frequentist transfer learning techniques may be viewed as an approximate Maximum a Posteriori approach to our model. Finally, we demonstrate superior predictive performance relative to the frequentist baseline on a genetics application, especially when the source data are limited.
