Analyzing Context Contributions in LLM-based Machine Translation

Abstract

Large language models (LLMs) have achieved state-of-the-art performance in machine translation (MT) and demonstrated the ability to leverage in-context learning through few-shot examples. However, the mechanisms by which LLMs use different parts of the input context remain largely unexplored. In this work, we provide a comprehensive analysis of context utilization in MT, studying how LLMs use various context parts, such as few-shot examples and the source text, when generating translations. We highlight several key findings: (1) the source part of few-shot examples appears to contribute more than its corresponding targets, irrespective of translation direction; (2) finetuning LLMs with parallel data alters the contribution patterns of different context parts; and (3) there is a positional bias where earlier few-shot examples have higher contributions to the translated sequence. Finally, we demonstrate that inspecting anomalous context contributions can potentially uncover pathological translations, such as hallucinations. Our findings shed light on the internal workings of LLM-based MT which go beyond those known for standard encoder-decoder MT models.
