Abstract
Pretrained multilingual models have become a de facto default approach for zero-shot cross-lingual transfer. Previous work has shown that these models are able to achieve cross-lingual representations when pretrained on two or more languages with shared parameters. In this work, we provide evidence that a model can achieve language-agnostic representations even when pretrained on a single language. That is, we find that monolingual models pretrained and finetuned on different languages achieve competitive performance compared to those that use the same target language. Surprisingly, the models show similar performance on the same task regardless of the pretraining language. For example, models pretrained on distant languages such as German and Portuguese perform similarly on English tasks.