On Certifying and Improving Generalization to Unseen Domains

Abstract

Domain Generalization (DG) aims to learn models whose performance remains high on unseen domains encountered at test time by using data from multiple related source domains. Many existing DG algorithms reduce the divergence between source distributions in a representation space to potentially align the unseen domain close to the sources. This is motivated by the analysis that explains generalization to unseen domains using distributional distance (such as the Wasserstein distance) to the sources. However, due to the openness of the DG objective, it is challenging to evaluate DG algorithms comprehensively using a few benchmark datasets. In particular, we demonstrate that the accuracy of the models trained with DG methods varies significantly across unseen domains generated from popular benchmark datasets. This highlights that the performance of DG methods on a few benchmark datasets may not be representative of their performance on unseen domains in the wild. To overcome this roadblock, we propose a universal certification framework based on distributionally robust optimization (DRO) that can efficiently certify the worst-case performance of any DG method. This enables a data-independent evaluation of a DG method complementary to the empirical evaluations on benchmark datasets. Furthermore, we propose a training algorithm that can be used with any DG method to provably improve its certified performance. Our empirical evaluation demonstrates the effectiveness of our method at significantly improving the worst-case loss (i.e., reducing the risk of failure of these models in the wild) without incurring a significant performance drop on benchmark datasets.
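
To make the idea of certifying worst-case performance concrete, the following is a minimal sketch of a Wasserstein DRO certificate on a toy problem. It is not the paper's procedure, which the abstract does not spell out. It assumes the standard Lagrangian dual bound for Wasserstein DRO: for any multiplier gamma >= 0, the worst-case expected loss over distributions within transport budget rho of the source is at most gamma * rho plus the source average of sup_z' [loss(z') - gamma * cost(z', z)]. The one-dimensional representation `z`, the fixed linear predictor `slope`, the squared loss, the squared-distance cost, and all function names are hypothetical choices made so that the inner supremum has a closed form.

```python
import numpy as np


def robust_surrogate(residuals, slope, gamma):
    """Closed form of sup_z' [(slope*z' - y)^2 - gamma*(z' - z)^2].

    `residuals` holds slope*z - y for each source sample. The supremum is
    finite only when gamma > slope**2, which the caller must ensure.
    """
    return gamma * residuals**2 / (gamma - slope**2)


def certify_worst_case_loss(z, y, slope, rho, num_gammas=2000):
    """Upper-bound the worst-case squared loss of the predictor slope*z.

    The bound covers every distribution whose expected squared transport
    cost from the empirical source samples (z, y) is at most rho (labels
    are kept fixed; only z is perturbed). Assumed toy setup, not the
    paper's algorithm.
    """
    if rho <= 0:
        raise ValueError("rho must be positive")
    z = np.asarray(z, dtype=float)
    y = np.asarray(y, dtype=float)
    residuals = slope * z - y

    # Candidate multipliers strictly above slope^2, on a log-spaced grid.
    gammas = slope**2 + np.logspace(-3, 3, num_gammas)

    # Dual objective for every candidate: gamma*rho + mean surrogate loss.
    surrogate = robust_surrogate(residuals[:, None], slope, gammas[None, :])
    bounds = gammas * rho + surrogate.mean(axis=0)

    # Each candidate is a valid upper bound, so the smallest one is kept.
    best = int(np.argmin(bounds))
    return float(bounds[best]), float(gammas[best])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    z = rng.normal(size=500)                  # hypothetical source features
    y = 2.0 * z + 0.5 * rng.normal(size=500)  # hypothetical noisy targets
    bound, gamma = certify_worst_case_loss(z, y, slope=2.0, rho=0.1)
    print(f"source loss: {np.mean((2.0 * z - y) ** 2):.4f}")
    print(f"certified worst-case loss: {bound:.4f} (gamma = {gamma:.3f})")
```

Because every candidate gamma yields a valid upper bound, the grid search can only make the certificate looser than the exact dual minimum, never invalid. In this toy case the exact minimum works out to (sqrt(m) + |slope| * sqrt(rho))^2, where m is the mean squared residual on the source samples, which gives a quick sanity check on the returned value.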
