From English To Foreign Languages: Transferring Pre-trained Language Models

  • 2020-02-18 00:22:54
  • Ke Tran
  • 7

Abstract

Pre-trained models have demonstrated their effectiveness in many downstreamnatural language processing (NLP) tasks. The availability of multilingualpre-trained models enables zero-shot transfer of NLP tasks from high resourcelanguages to low resource ones. However, recent research in improvingpre-trained models focuses heavily on English. While it is possible to trainthe latest neural architectures for other languages from scratch, it isundesirable due to the required amount of compute. In this work, we tackle theproblem of transferring an existing pre-trained model from English to otherlanguages under a limited computational budget. With a single GPU, our approachcan obtain a foreign BERT base model within a day and a foreign BERT largewithin two days. Furthermore, evaluating our models on six languages, wedemonstrate that our models are better than multilingual BERT on two zero-shottasks: natural language inference and dependency parsing.

 

Quick Read (beta)

loading the full paper ...