Abstract
Graph data are inherently complex and heterogeneous, leading to a high natural diversity of distributional shifts. However, it remains unclear how to build machine learning architectures that generalize to the complex distributional shifts naturally occurring in the real world. Here, we develop GraphMETRO, a Graph Neural Network architecture that models natural diversity and captures complex distributional shifts. GraphMETRO employs a Mixture-of-Experts (MoE) architecture with a gating model and multiple expert models, where each expert model targets a specific distributional shift to produce a referential representation w.r.t. a reference model, and the gating model identifies shift components. Additionally, we design a novel objective that aligns the representations from different expert models to ensure reliable optimization. GraphMETRO achieves state-of-the-art results on four datasets from the GOOD benchmark, which is comprised of complex and natural real-world distribution shifts, improving by 67% and 4.2% on the WebKB and Twitch datasets. Code and data are available at https://github.com/Wuyxin/GraphMETRO.
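
To make the gating-plus-experts structure concrete, the following is a minimal illustrative sketch, not the released GraphMETRO implementation. It assumes that each expert is a single linear map over an already pooled graph feature vector (the actual experts are graph neural networks), that the gating model outputs softmax weights, that expert 0 plays the role of the reference model, and that alignment is measured as a mean squared distance to the reference output. The names `MixtureOfExperts` and `alignment_penalty` are hypothetical and chosen only for this example.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, x):
    """Apply a dense layer (given as a list of rows) to a feature vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def random_matrix(rows, cols, rng):
    """Random weights in [-1, 1]; a stand-in for trained parameters."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


class MixtureOfExperts:
    """Toy gating model plus experts (hypothetical, for illustration only)."""

    def __init__(self, in_dim, out_dim, num_experts, seed=0):
        rng = random.Random(seed)
        # Assumption: expert 0 stands in for the reference model and the
        # remaining experts each target one shift component.
        self.experts = [random_matrix(out_dim, in_dim, rng)
                        for _ in range(num_experts)]
        # Assumption: the gating model is one linear layer followed by softmax.
        self.gate = random_matrix(num_experts, in_dim, rng)

    def forward(self, x):
        gate_weights = softmax(linear(self.gate, x))
        expert_outputs = [linear(w, x) for w in self.experts]
        out_dim = len(expert_outputs[0])
        # Final representation: gate-weighted sum of the expert outputs.
        mixed = [sum(g * out[d] for g, out in zip(gate_weights, expert_outputs))
                 for d in range(out_dim)]
        return mixed, gate_weights, expert_outputs


def alignment_penalty(expert_outputs):
    """Mean squared distance of each non-reference expert to expert 0.

    Assumption: this is only one plausible way to align the experts'
    representations; the objective used in the paper may differ.
    """
    reference, others = expert_outputs[0], expert_outputs[1:]
    if not others:
        return 0.0
    total = sum(sum((a - b) ** 2 for a, b in zip(out, reference))
                for out in others)
    return total / len(others)


model = MixtureOfExperts(in_dim=4, out_dim=3, num_experts=3)
mixed, gate_weights, expert_outputs = model.forward([0.5, -1.0, 0.25, 2.0])
penalty = alignment_penalty(expert_outputs)
```

In this sketch the gate weights sum to one, so the mixed representation is a convex combination of the expert outputs, and the penalty shrinks to zero as the experts agree with the reference.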