Do LLMs Need to Think in One Language? Correlation between Latent Language and Task Performance

Abstract

Large Language Models (LLMs) are known to consistently process information in an internal language in which they are proficient, referred to as the latent language, which may differ from the input or output language. However, how the discrepancy between the latent language and the input and output languages affects downstream task performance remains largely unexplored. While many studies investigate the latent language of LLMs, few address its influence on task performance. In our study, we hypothesize that thinking consistently in the latent language enhances downstream task performance. To validate this, we vary the input prompt language across multiple downstream tasks and analyze the correlation between latent-language consistency and task performance. We create datasets consisting of questions from diverse domains such as translation and geo-culture, which are influenced by the choice of latent language. Experimental results across multiple LLMs on these tasks indicate that maintaining consistency in the latent language is not always necessary for optimal downstream task performance. This is because these models adapt their internal representations near the final layers to match the target language, reducing the impact of consistency on overall performance.
