Gradient-Guided Annealing for Domain Generalization

Abstract

Domain Generalization (DG) research has gained considerable traction as of late, since the ability to generalize to unseen data distributions is a requirement that eludes even state-of-the-art training algorithms. In this paper we observe that the initial iterations of model training play a key role in domain generalization effectiveness, since the loss landscape may be significantly different across the training and test distributions, contrary to the case of i.i.d. data. Conflicts between gradients of the loss components of each domain lead the optimization procedure to undesirable local minima that do not capture the domain-invariant features of the target classes. We propose alleviating domain conflicts in model optimization, by iteratively annealing the parameters of a model in the early stages of training and searching for points where gradients align between domains. By discovering a set of parameter values where gradients are updated towards the same direction for each data distribution present in the training set, the proposed Gradient-Guided Annealing (GGA) algorithm encourages models to seek out minima that exhibit improved robustness against domain shifts. The efficacy of GGA is evaluated on five widely accepted and challenging image classification domain generalization benchmarks, where its use alone is able to establish highly competitive or even state-of-the-art performance. Moreover, when combined with previously proposed domain-generalization algorithms it is able to consistently improve their effectiveness by significant margins.
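
The idea of annealing parameters toward points where per-domain gradients align can be illustrated with a small toy sketch. This is not the authors' implementation: the linear model, the squared-error loss, the uniform perturbation, the greedy acceptance rule, and every name below (`domain_gradient`, `mean_pairwise_similarity`, `anneal_for_alignment`, `radius`, `steps`) are assumptions made for illustration only. The abstract does not specify how candidates are sampled or accepted.

```python
import math
import random


def domain_gradient(w, data):
    """Gradient of the mean squared error of a linear model y ~ w.x on one domain."""
    grad = [0.0] * len(w)
    for x, y in data:
        err = sum(wi * xi for wi, xi in zip(w, x)) - y
        for j, xj in enumerate(x):
            grad[j] += 2.0 * err * xj / len(data)
    return grad


def cosine(u, v):
    """Cosine similarity of two vectors; defined as 0.0 if either has zero norm."""
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    if nu == 0.0 or nv == 0.0:
        return 0.0
    return sum(a * b for a, b in zip(u, v)) / (nu * nv)


def mean_pairwise_similarity(w, domains):
    """Average cosine similarity over all pairs of per-domain gradients."""
    grads = [domain_gradient(w, d) for d in domains]
    pairs = [(i, j) for i in range(len(grads)) for j in range(i + 1, len(grads))]
    return sum(cosine(grads[i], grads[j]) for i, j in pairs) / len(pairs)


def anneal_for_alignment(w, domains, steps=200, radius=0.1, seed=0):
    """Randomly perturb w and keep a candidate only if gradient alignment improves.

    Assumed, simplified acceptance rule: greedy on similarity alone.
    """
    rng = random.Random(seed)
    best_w = list(w)
    best_sim = mean_pairwise_similarity(best_w, domains)
    for _ in range(steps):
        candidate = [wi + rng.uniform(-radius, radius) for wi in best_w]
        sim = mean_pairwise_similarity(candidate, domains)
        if sim > best_sim:
            best_w, best_sim = candidate, sim
    return best_w, best_sim


# Two hypothetical domains whose targets disagree on the second feature.
domains = [
    [((1.0, 0.5), 1.5), ((0.5, 1.0), 1.5), ((1.0, 1.0), 2.0)],
    [((1.0, 0.5), 0.5), ((0.5, 1.0), -0.5), ((1.0, 1.0), 0.0)],
]
w0 = [0.0, 0.0]
initial_sim = mean_pairwise_similarity(w0, domains)
final_w, final_sim = anneal_for_alignment(w0, domains)
print(f"gradient similarity before: {initial_sim:.3f}, after: {final_sim:.3f}")
```

In a setting closer to the one described above, such a search would presumably run only during the early iterations of training, after which ordinary gradient-based optimization continues from the parameters it returns.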
