Abstract
Pretrained language models (PLMs) have shown remarkable generalization across multiple tasks and languages. Nonetheless, their generalization to unseen languages is poor, resulting in significantly worse performance or even nonsensical responses comparable to a random baseline. This longstanding limitation of PLMs raises concerns about diversity and equal access to language modeling technology. In this work, we address this limitation by introducing LinguAlchemy, a regularization technique that incorporates typological, geographical, and phylogenetic aspects of languages, constraining the resulting PLM representations to better characterize the corresponding linguistic constraints. LinguAlchemy significantly improves the accuracy of mBERT and XLM-R on unseen languages by ~18% and ~2%, respectively, compared to fully finetuned models, displaying a high degree of unseen-language generalization. We further introduce AlchemyScale and AlchemyTune, extensions of LinguAlchemy that adjust the linguistic regularization weights automatically, alleviating the need for hyperparameter search. LinguAlchemy enables better cross-lingual generalization to unseen languages, which is vital for better inclusivity and accessibility of PLMs.