BLOOM+1: Adding Language Support to BLOOM for Zero-Shot Prompting

Abstract

The BLOOM model is a large publicly available multilingual language model, but its pretraining was limited to 46 languages. To extend the benefits of BLOOM to other languages without incurring prohibitively large costs, it is desirable to adapt BLOOM to new languages not seen during pretraining. In this work, we apply existing language adaptation strategies to BLOOM and benchmark its zero-shot prompting performance on eight new languages in a resource-constrained setting. We find language adaptation to be effective at improving zero-shot performance in new languages. Surprisingly, we find that adapter-based finetuning is more effective than continued pretraining for large models. In addition, we discover that prompting performance is not significantly affected by language specifics, such as the writing system. It is primarily determined by the size of the language adaptation data. We also add new languages to BLOOMZ, which is a multitask finetuned version of BLOOM capable of following task instructions zero-shot. We find including a new language in the multitask fine-tuning mixture to be the most effective method to teach BLOOMZ a new language. We conclude that with sufficient training data, language adaptation can generalize well to diverse languages. Our code is available at https://github.com/bigscience-workshop/multilingual-modeling.
