Domain Generalization via Gradient Surgery

Abstract

In real-life applications, machine learning models often face scenarios where the data distribution changes between the training and test domains. When the aim is to make predictions on distributions different from those seen at training time, we face a domain generalization problem. Methods that address this issue learn a model using data from multiple source domains and then apply this model to the unseen target domain. Our hypothesis is that, when training with multiple domains, conflicting gradients within each mini-batch contain information specific to the individual domains that is irrelevant to the others, including the test domain. If left untouched, such disagreement may degrade generalization performance. In this work, we characterize the conflicting gradients emerging in domain shift scenarios and devise novel gradient agreement strategies based on gradient surgery to alleviate their effect. We validate our approach on image classification tasks with three multi-domain datasets, showing the value of the proposed agreement strategy in enhancing the generalization capability of deep learning models in domain shift scenarios.
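To make the idea of conflicting gradients concrete, the following is a minimal sketch of one possible agreement rule. It is an assumption made for illustration, since the abstract does not specify the strategies: a gradient component is kept only when its sign matches across all source domains, and is zeroed otherwise. The function names `agreement_mask` and `agree_and_sum` are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def sign(x: float) -> int:
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def agreement_mask(domain_grads: Sequence[Sequence[float]]) -> List[bool]:
    """Mark the components whose sign is identical (and non-zero) in every domain.

    domain_grads holds one flattened gradient vector per source domain.
    """
    if not domain_grads:
        raise ValueError("need at least one domain gradient")
    n_params = len(domain_grads[0])
    if any(len(g) != n_params for g in domain_grads):
        raise ValueError("all domain gradients must have the same length")

    mask = []
    for j in range(n_params):
        signs = {sign(g[j]) for g in domain_grads}
        # Agreement: a single shared sign, and that sign is not zero.
        mask.append(len(signs) == 1 and 0 not in signs)
    return mask


def agree_and_sum(domain_grads: Sequence[Sequence[float]]) -> List[float]:
    """Sum the agreeing components across domains; zero the conflicting ones.

    This is an assumed illustrative rule, not necessarily the paper's method.
    """
    mask = agreement_mask(domain_grads)
    return [
        sum(g[j] for g in domain_grads) if keep else 0.0
        for j, keep in enumerate(mask)
    ]


# Three hypothetical source domains, three parameters.
grads = [
    [0.5, -1.0, 0.25],
    [0.25, 1.0, 0.5],
    [1.0, -0.5, 0.25],
]
print(agreement_mask(grads))  # [True, False, True]
print(agree_and_sum(grads))   # [1.75, 0.0, 1.0]
```

In this toy example the second component has mixed signs across domains, so it is treated as domain-specific and dropped from the update, while the first and third components are summed as usual.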
