Lens: Rethinking Multilingual Enhancement for Large Language Models

  • 2024-10-06 09:51:30
  • Weixiang Zhao, Yulin Hu, Jiahe Guo, Xingyu Sui, Tongtong Wu, Yang Deng, Yanyan Zhao, Bing Qin, Wanxiang Che, Ting Liu

Abstract

Despite the growing global demand for large language models (LLMs) that serve users from diverse linguistic backgrounds, most cutting-edge LLMs remain predominantly English-centric. This creates a performance gap across languages, restricting access to advanced AI services for non-English speakers. Current methods to enhance multilingual capabilities largely rely on data-driven post-training techniques, such as multilingual instruction tuning or continual pre-training. However, these approaches encounter significant challenges, including the scarcity of high-quality multilingual datasets and the limited enhancement of multilingual capabilities. They often suffer from off-target issues and catastrophic forgetting of central language abilities. To this end, we propose Lens, a novel approach to enhance multilingual capabilities of LLMs by leveraging their internal language representation spaces. Specifically, Lens operates by manipulating the hidden representations within the language-agnostic and language-specific subspaces from top layers of LLMs. Using the central language as a pivot, the target language is drawn closer to it within the language-agnostic subspace, allowing it to inherit well-established semantic representations. Meanwhile, in the language-specific subspace, the representations of the target and central languages are pushed apart, enabling the target language to express itself distinctly. Extensive experiments on one English-centric and two multilingual LLMs demonstrate that Lens effectively improves multilingual performance without sacrificing the original central language capabilities of the backbone model, achieving superior results with far fewer computational resources than existing post-training approaches.
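
The pull-and-push idea in the abstract can be illustrated with a toy objective. This is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not say how the subspaces are obtained or how the two terms are combined, so the orthonormal bases, the squared-distance terms, the `push_weight` parameter, and the function name `lens_style_objective` are all hypothetical choices made for illustration.

```python
# Illustrative sketch only; NOT the method from the paper.
# Assumption: each subspace is given as an orthonormal basis (list of unit vectors).

def project(vec, basis):
    """Coordinates of vec in the subspace spanned by an orthonormal basis."""
    return [sum(v * b for v, b in zip(vec, direction)) for direction in basis]

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def lens_style_objective(h_target, h_central, agnostic_basis, specific_basis,
                         push_weight=1.0):
    """Hypothetical loss: lower is better.

    pull: gap between target and central in the language-agnostic subspace
          (to be minimized, drawing the target toward the pivot).
    push: gap in the language-specific subspace
          (to be maximized, keeping the languages distinct).
    """
    pull = sq_dist(project(h_target, agnostic_basis),
                   project(h_central, agnostic_basis))
    push = sq_dist(project(h_target, specific_basis),
                   project(h_central, specific_basis))
    return pull - push_weight * push

# Toy 3-d hidden states: axes 0-1 play the "agnostic" role, axis 2 the "specific" role.
agnostic_basis = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
specific_basis = [[0.0, 0.0, 1.0]]
h_central = [1.0, 2.0, 0.5]

# Target far from the pivot semantically, identical in the specific subspace.
before = lens_style_objective([0.0, 0.0, 0.5], h_central,
                              agnostic_basis, specific_basis)
# Target aligned with the pivot semantically, separated in the specific subspace.
after = lens_style_objective([1.0, 2.0, -1.5], h_central,
                             agnostic_basis, specific_basis)
```

In this toy setup the second target vector scores lower than the first, matching the behavior the abstract describes. An unbounded push term like this one would need a cap or margin in practice; that detail is likewise an assumption here.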

 
