GreenPLM: Cross-lingual pre-trained language models conversion with (almost) no cost

  • 2022-11-13 18:59:15
  • Qingcheng Zeng, Lucas Garay, Peilin Zhou, Dading Chong, Yining Hua, Jiageng Wu, Yikang Pan, Han Zhou, Jie Yang
While large pre-trained models have transformed the field of natural language processing (NLP), the high training cost and low cross-lingual availability of such models prevent the new advances from being equally shared by users across all languages, especially the less spoken ones. To promote equal opportunities for all language speakers in NLP research and to reduce energy consumption for sustainability, this study proposes an effective and energy-efficient framework, GreenPLM, that uses bilingual lexicons to directly translate language models of one language into other languages at (almost) no additional cost. We validate this approach in 18 languages and show that this framework is comparable to, if not better than, other heuristics trained with high cost. In addition, when given a low computational cost (2.5%), the framework outperforms the original monolingual language models in six out of seven tested languages. This approach can be easily implemented, and we will release language models in 50 languages translated from English soon.
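The abstract does not spell out how a bilingual lexicon "translates" a model. As a rough illustration only, the sketch below assumes the simplest reading: each source-language vocabulary entry is renamed to its lexicon translation, while the token ids (and therefore the embedding rows and all other weights) stay untouched. The function names `translate_vocab` and `encode`, the toy vocabulary, and the collision rule (first source token wins) are all hypothetical and are not taken from the paper.

```python
def translate_vocab(vocab, lexicon):
    """Rename vocabulary tokens using a bilingual lexicon, keeping ids fixed.

    vocab:   dict mapping source-language token -> integer id
    lexicon: dict mapping source-language token -> target-language token
    Tokens missing from the lexicon (e.g. special tokens) are kept as-is.
    Assumption: if several source tokens share one translation, the first
    one encountered keeps the slot; the paper may resolve this differently.
    """
    translated = {}
    for token, idx in vocab.items():
        target = lexicon.get(token, token)
        if target not in translated:
            translated[target] = idx
    return translated


def encode(tokens, vocab, unk_id=0):
    """Map a list of tokens to ids, falling back to unk_id for unknown tokens."""
    return [vocab.get(tok, unk_id) for tok in tokens]


# Toy English vocabulary and a toy English -> Spanish lexicon (illustrative data).
english_vocab = {"[UNK]": 0, "cat": 1, "dog": 2, "house": 3, "home": 4}
en_to_es = {"cat": "gato", "dog": "perro", "house": "casa", "home": "casa"}

spanish_vocab = translate_vocab(english_vocab, en_to_es)

# Spanish tokens now index the same embedding rows the English tokens used.
spanish_ids = encode(["gato", "casa", "pez"], spanish_vocab)
```

Under this assumption no gradient step is needed to obtain the target-language model, which is consistent with the "(almost) no cost" claim; the optional low-cost setting mentioned in the abstract would correspond to some additional training after the renaming.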


