Abstract
Adapters are lightweight modules that allow parameter-efficient fine-tuning of pretrained models. Specialized language and task adapters have recently been proposed to facilitate cross-lingual transfer of multilingual pretrained models (Pfeiffer et al., 2020b). However, this approach requires training a separate language adapter for every language one wishes to support, which can be impractical for languages with limited data. An intuitive solution is to use a related language adapter for the new language variety, but we observe that this solution can lead to sub-optimal performance. In this paper, we aim to improve the robustness of language adapters to uncovered languages without training new adapters. We find that ensembling multiple existing language adapters makes the fine-tuned model significantly more robust to other language varieties not included in these adapters. Building upon this observation, we propose Entropy Minimized Ensemble of Adapters (EMEA), a method that optimizes the ensemble weights of the pretrained language adapters for each test sentence by minimizing the entropy of its predictions. Experiments on three diverse groups of language varieties show that our method leads to significant improvements on both named entity recognition and part-of-speech tagging across all languages.
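
To make the idea of test-time entropy minimization concrete, the following is a minimal sketch rather than the implementation used in this work. It assumes a simplified setting in which each adapter contributes a set of per-token logits that are mixed linearly, whereas the actual method ensembles adapters inside the model. The function names, the uniform initialization, the step count, the learning rate, and the step-halving rule are all illustrative assumptions.

```python
import math


def softmax(row):
    """Numerically stable softmax over one list of logits."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_probs(alpha, adapter_logits):
    """Mix per-adapter logits with weights alpha, then softmax per token.

    adapter_logits[k][t][c]: logit of class c for token t under adapter k
    (a toy stand-in for the adapters' contributions; an assumption).
    """
    n_tokens = len(adapter_logits[0])
    n_classes = len(adapter_logits[0][0])
    probs = []
    for t in range(n_tokens):
        mixed = [sum(a * adapter_logits[k][t][c] for k, a in enumerate(alpha))
                 for c in range(n_classes)]
        probs.append(softmax(mixed))
    return probs


def mean_entropy(alpha, adapter_logits):
    """Average per-token entropy of the ensembled prediction."""
    probs = ensemble_probs(alpha, adapter_logits)
    return sum(-sum(p * math.log(p + 1e-12) for p in row)
               for row in probs) / len(probs)


def entropy_grad(alpha, adapter_logits):
    """Gradient of mean_entropy with respect to the ensemble weights.

    Uses dH/dz_c = -p_c * (log p_c + H) for softmax logits z, followed by
    the chain rule through the linear mixture.
    """
    probs = ensemble_probs(alpha, adapter_logits)
    grad = [0.0] * len(alpha)
    for t, row in enumerate(probs):
        h = -sum(p * math.log(p + 1e-12) for p in row)
        for c, p in enumerate(row):
            dz = -p * (math.log(p + 1e-12) + h)
            for k in range(len(alpha)):
                grad[k] += dz * adapter_logits[k][t][c] / len(probs)
    return grad


def optimize_weights(adapter_logits, steps=10, lr=0.5):
    """Gradient descent on the weights for one sentence.

    Starts from uniform weights; a step is kept only if it lowers the
    entropy, otherwise the learning rate is halved (illustrative choice).
    """
    k = len(adapter_logits)
    alpha = [1.0 / k] * k
    best = mean_entropy(alpha, adapter_logits)
    for _ in range(steps):
        grad = entropy_grad(alpha, adapter_logits)
        candidate = [a - lr * g for a, g in zip(alpha, grad)]
        h = mean_entropy(candidate, adapter_logits)
        if h < best:
            alpha, best = candidate, h
        else:
            lr *= 0.5
    return alpha, best


# Toy usage: 2 adapters, 1 sentence of 2 tokens, 3 tag classes (made-up numbers).
toy_logits = [
    [[2.0, 0.5, -1.0], [0.1, 0.2, 0.0]],
    [[1.5, -0.5, 0.0], [-1.0, 2.0, 0.3]],
]
weights, entropy = optimize_weights(toy_logits)
print(weights, entropy)
```

In this sketch the weights are left unconstrained and are recomputed from the uniform starting point for every sentence, which mirrors the per-sentence optimization described above; whether and how the weights are normalized in practice is not specified here.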