Transcending Language Boundaries: Harnessing LLMs for Low-Resource Language Translation

  • 2024-11-18 05:41:27
  • Peng Shu, Junhao Chen, Zhengliang Liu, Hui Wang, Zihao Wu, Tianyang Zhong, Yiwei Li, Huaqin Zhao, Hanqi Jiang, Yi Pan, Yifan Zhou, Constance Owl, Xiaoming Zhai, Ninghao Liu, Claudio Saunt, Tianming Liu

Abstract

Large Language Models (LLMs) have demonstrated remarkable success across a wide range of tasks and domains. However, their performance in low-resource language translation, particularly when translating into these languages, remains underexplored. This gap poses significant challenges, as linguistic barriers hinder the cultural preservation and development of minority communities. To address this issue, this paper introduces a novel retrieval-based method that enhances translation quality for low-resource languages by focusing on key terms: it translates keywords and retrieves corresponding examples from existing data. To evaluate the effectiveness of this method, we conducted experiments translating from English into three low-resource languages: Cherokee, a critically endangered indigenous language of North America; Tibetan, a historically and culturally significant language in Asia; and Manchu, a language with few remaining speakers. Our comparison with the zero-shot performance of GPT-4o and LLaMA 3.1 405B highlights the significant challenges these models face when translating into low-resource languages. In contrast, our retrieval-based method shows promise in improving both word-level accuracy and overall semantic understanding by leveraging existing resources more effectively.
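
The abstract describes the method only at a high level, so the following Python sketch is an illustration of the general idea (translate keywords, retrieve matching examples, assemble a prompt) and not the authors' implementation. Every name here (`extract_keywords`, `retrieve_examples`, `build_prompt`, `Example`, the stopword list) is hypothetical, the keyword extraction is a naive stopword filter chosen for brevity, and the glossary and corpus are assumed to be supplied by the caller.

```python
from dataclasses import dataclass

# Assumption: a tiny stopword list stands in for real keyword extraction.
STOPWORDS = {"the", "a", "an", "is", "of", "to", "in", "and"}


@dataclass(frozen=True)
class Example:
    """A parallel sentence pair from existing data (hypothetical type)."""
    source: str
    target: str


def extract_keywords(sentence):
    """Return lowercase non-stopword tokens in order, without duplicates."""
    keywords = []
    for raw in sentence.split():
        word = raw.strip(".,;:!?").lower()
        if word and word not in STOPWORDS and word not in keywords:
            keywords.append(word)
    return keywords


def retrieve_examples(keywords, corpus, limit=2):
    """Return up to `limit` pairs whose source shares a keyword with the input."""
    wanted = set(keywords)
    hits = [ex for ex in corpus if wanted & set(extract_keywords(ex.source))]
    return hits[:limit]


def build_prompt(sentence, glossary, corpus, target_language):
    """Assemble a prompt containing keyword translations and retrieved examples."""
    keywords = extract_keywords(sentence)
    term_lines = [f"{k} -> {glossary[k]}" for k in keywords if k in glossary]
    example_lines = [
        f"{ex.source} => {ex.target}"
        for ex in retrieve_examples(keywords, corpus)
    ]
    return "\n".join([
        f"Translate from English into {target_language}.",
        "Key terms:",
        *term_lines,
        "Examples:",
        *example_lines,
        f"Sentence: {sentence}",
    ])
```

The returned string would then be sent to an LLM of choice; that call is omitted because the abstract does not specify how the models are prompted. In practice the glossary would come from a bilingual dictionary and the corpus from existing parallel text in the target language.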

 
