Assessing Translation capabilities of Large Language Models involving English and Indian Languages

Abstract

Generative Large Language Models (LLMs) have achieved remarkable advancements in various NLP tasks. In this work, we explore the multilingual capabilities of LLMs, using machine translation between English and 22 Indian languages as the task. We first investigate the translation capabilities of raw large language models, then explore the in-context learning capabilities of the same raw models. We then fine-tune these models using parameter-efficient fine-tuning methods such as LoRA, and additionally with full fine-tuning. Through our study, we identify the best-performing large language model for the translation task, which is based on LLaMA. Our results demonstrate significant progress: using 2-stage fine-tuned LLaMA-13b for English to Indian languages, we obtain average BLEU scores of 13.42, 15.93, 12.13, 12.30, and 12.07, and chrF scores of 43.98, 46.99, 42.55, 42.42, and 45.39, on the IN22 (conversational), IN22 (general), flores200-dev, flores200-devtest, and newstest2019 test sets, respectively. Similarly, for Indian languages to English, using fine-tuned LLaMA-13b, we achieve average BLEU scores of 14.03, 16.65, 16.17, 15.35, and 12.55, and chrF scores of 36.71, 40.44, 40.26, 39.51, and 36.20, on the IN22 (conversational), IN22 (general), flores200-dev, flores200-devtest, and newstest2019 test sets, respectively. Overall, our findings highlight the potential and strength of large language models for machine translation, including for languages that are currently underrepresented in LLMs.
