Analysis of Multi-Source Language Training in Cross-Lingual Transfer

Abstract

The successful adaptation of multilingual language models (LMs) to a specific language-task pair critically depends on the availability of data tailored for that condition. While cross-lingual transfer (XLT) methods have helped address this data scarcity problem, there is ongoing debate about the mechanisms behind their effectiveness. In this work, we focus on one promising assumption about the inner workings of XLT: that it encourages multilingual LMs to place greater emphasis on language-agnostic or task-specific features. We test this hypothesis by examining how the patterns of XLT change as the number of source languages involved in the process varies. Our experimental findings show that the use of multiple source languages in XLT, a technique we term Multi-Source Language Training (MSLT), leads to increased mingling of the embedding spaces of different languages, supporting the claim that XLT benefits from making use of language-independent information. On the other hand, we find that an arbitrary combination of source languages does not always guarantee better performance. We suggest simple heuristics for identifying effective language combinations for MSLT and empirically demonstrate their effectiveness.
