Abstract
Despite advances in the multilingual capabilities of Large Language Models (LLMs) across diverse tasks, English remains the dominant language for LLM research and development. As a result, when working with another language, pre-translation, i.e., translating the task prompt into English before inference, has become a widespread practice. Selective pre-translation, a more surgical approach, focuses on translating specific prompt components. However, its current use is sporadic and lacks a systematic research foundation. Consequently, the optimal pre-translation strategy for various multilingual settings and tasks remains unclear. In this work, we aim to uncover the optimal setup for pre-translation by systematically assessing its use. Specifically, we view the prompt as a modular entity composed of four functional parts: instruction, context, examples, and output, each of which could be translated or not. We evaluate pre-translation strategies across 35 languages, covering both low- and high-resource languages, on various tasks including Question Answering (QA), Natural Language Inference (NLI), Named Entity Recognition (NER), and Abstractive Summarization. Our experiments show the impact of factors such as similarity to English, translation quality, and the size of the pre-training data on model performance with pre-translation. We suggest practical guidelines for choosing optimal strategies in various multilingual settings.
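
To make the modular view of the prompt concrete, the following is a minimal sketch that enumerates every way of translating or keeping each of the four components. The labels "source" and "english" and the function name `enumerate_strategies` are illustrative assumptions, not names taken from this work, and the sketch does not imply that all of these combinations are evaluated in the experiments.

```python
from itertools import product

# The four prompt components named in the abstract.
COMPONENTS = ("instruction", "context", "examples", "output")

# Hypothetical labels: keep a component in the source language,
# or pre-translate it into English.
CHOICES = ("source", "english")


def enumerate_strategies():
    """Return every per-component assignment of a language choice."""
    return [
        dict(zip(COMPONENTS, assignment))
        for assignment in product(CHOICES, repeat=len(COMPONENTS))
    ]


strategies = enumerate_strategies()
print(len(strategies))  # two choices for each of four components
print(strategies[0])    # the assignment with nothing translated
```

With two choices per component, the space contains 2^4 = 16 candidate configurations, ranging from no translation at all to full pre-translation of the prompt.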