An Efficient Approach for Studying Cross-Lingual Transfer in Multilingual Language Models

Abstract

The capacity and effectiveness of pre-trained multilingual models (MLMs) for zero-shot cross-lingual transfer are well established. However, the phenomena of positive and negative transfer, and the effect of language choice, are not yet fully understood, especially in the complex setting of massively multilingual LMs. We propose an efficient method to study the influence of a transfer language on zero-shot performance in another target language. Unlike previous work, our approach disentangles downstream tasks from language, using dedicated adapter units. Our findings suggest that some languages do not largely affect others, while some languages, especially ones unseen during pre-training, can be extremely beneficial or detrimental for different target languages. We find that no transfer language is beneficial for all target languages. We do, curiously, observe that languages previously unseen by MLMs consistently benefit from transfer from almost any language. We additionally use our modular approach to quantify negative interference efficiently and categorize languages accordingly. Furthermore, we provide a list of promising transfer-target language configurations that consistently lead to target language performance improvements. Code and data are publicly available: https://github.com/ffaisal93/neg_inf
