How Transferable are Attribute Controllers on Pretrained Multilingual Translation Models?

  • 2023-09-15 18:33:24
  • Danni Liu, Jan Niehues
Customizing machine translation models to comply with fine-grained attributes such as formality has seen tremendous progress recently. However, current approaches mostly rely on at least some supervised data with attribute annotation. Data scarcity therefore remains a bottleneck to democratizing such customization possibilities to a wider range of languages, lower-resource ones in particular. Given recent progress in pretrained massively multilingual translation models, we use them as a foundation to transfer the attribute controlling capabilities to languages without supervised data. In this work, we present a comprehensive analysis of transferring attribute controllers based on a pretrained NLLB-200 model. We investigate both training- and inference-time control techniques under various data scenarios, and uncover their relative strengths and weaknesses in zero-shot performance and domain robustness. We show that the two paradigms are complementary, as evidenced by consistent improvements on 5 zero-shot directions. Moreover, a human evaluation on a real low-resource language, Bengali, confirms our findings on zero-shot transfer to new target languages. The code is $\href{}{\text{here}}$.


