Cross-lingual transfer of multilingual models on low resource African Languages

Abstract

Large multilingual models have significantly advanced natural language processing (NLP) research. However, their high resource demands and potential biases from diverse data sources have raised concerns about their effectiveness across low-resource languages. In contrast, monolingual models, trained on a single language, may better capture the nuances of the target language, potentially providing more accurate results. This study benchmarks cross-lingual transfer from a high-resource language to a low-resource language for both monolingual and multilingual models, focusing on Kinyarwanda and Kirundi, two Bantu languages. We evaluate the performance of transformer-based architectures such as Multilingual BERT (mBERT), AfriBERT, and BantuBERTa against neural-based architectures such as BiGRU, CNN, and char-CNN. The models were trained on Kinyarwanda and tested on Kirundi, with fine-tuning applied to assess the extent of performance improvement and catastrophic forgetting. AfriBERT achieved the highest cross-lingual accuracy of 88.3% after fine-tuning, while BiGRU emerged as the best-performing neural model with 83.3% accuracy. We also analyze the degree of forgetting in the original language after fine-tuning. While monolingual models remain competitive, this study highlights that multilingual models offer strong cross-lingual transfer capabilities in resource-limited settings.
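
The evaluation protocol described in the abstract (train on Kinyarwanda, test on Kirundi, fine-tune, then re-test both languages) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the dictionary keys, and the definition of forgetting as the drop in source-language accuracy after fine-tuning are assumptions made for illustration, and the integer labels are toy data rather than results from the study.

```python
from typing import Dict, Sequence


def accuracy(gold: Sequence[int], pred: Sequence[int]) -> float:
    """Fraction of predictions that match the gold labels."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    if not gold:
        raise ValueError("cannot compute accuracy on an empty set")
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def transfer_report(
    src_gold: Sequence[int],
    src_pred_before: Sequence[int],
    src_pred_after: Sequence[int],
    tgt_gold: Sequence[int],
    tgt_pred_before: Sequence[int],
    tgt_pred_after: Sequence[int],
) -> Dict[str, float]:
    """Summarise transfer and forgetting (hypothetical helper).

    "before" = model trained on the source language only.
    "after"  = the same model after fine-tuning on the target language.
    """
    src_before = accuracy(src_gold, src_pred_before)
    src_after = accuracy(src_gold, src_pred_after)
    tgt_before = accuracy(tgt_gold, tgt_pred_before)
    tgt_after = accuracy(tgt_gold, tgt_pred_after)
    return {
        "zero_shot_target": tgt_before,
        "fine_tuned_target": tgt_after,
        "target_gain": tgt_after - tgt_before,
        # Assumed definition: loss of source accuracy caused by fine-tuning.
        "source_forgetting": src_before - src_after,
    }


# Toy labels only (integer class ids); not data from the paper.
report = transfer_report(
    src_gold=[0, 1, 2, 0],
    src_pred_before=[0, 1, 2, 0],
    src_pred_after=[0, 1, 0, 0],
    tgt_gold=[1, 0, 2, 0],
    tgt_pred_before=[1, 2, 2, 1],
    tgt_pred_after=[1, 0, 2, 1],
)
print(report)
```

In this sketch a positive `target_gain` indicates that fine-tuning helped on the target language, while a positive `source_forgetting` indicates that it cost accuracy on the source language.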
