Revealing the Parallel Multilingual Learning within Large Language Models

  • 2024-10-08 05:03:16
  • Yongyu Mu, Peinan Feng, Zhiquan Cao, Yuzhang Wu, Bei Li, Chenglong Wang, Tong Xiao, Kai Song, Tongran Liu, Chunliang Zhang, Jingbo Zhu

Abstract

In this study, we reveal an in-context learning (ICL) capability of multilingual large language models (LLMs): by translating the input into several languages, we provide Parallel Input in Multiple Languages (PiM) to LLMs, which significantly enhances their comprehension abilities. To test this capability, we design extensive experiments encompassing 8 typical datasets, 7 languages, and 8 state-of-the-art multilingual LLMs. Experimental results show that (1) incorporating more languages helps PiM surpass conventional ICL further; and (2) even combining the input with translations whose performance is inferior to the baseline can also help. Moreover, by examining the activated neurons in LLMs, we discover a counterintuitive but interesting phenomenon. Contrary to the common thought that PiM would activate more neurons than monolingual input in order to leverage knowledge learned from diverse languages, PiM actually inhibits neurons and promotes more precise neuron activation, especially when more languages are added. This phenomenon aligns with the neuroscience insight about synaptic pruning, which removes less-used neural connections, strengthens the remaining ones, and thereby enhances brain intelligence.
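To make the idea of parallel input concrete, the following is a minimal sketch of how a PiM-style prompt might be assembled. The abstract does not specify a prompt template, so the layout, the function name `build_pim_prompt`, the default instruction, and the hard-coded translations are all illustrative assumptions; in practice the translations would come from a machine translation system or another model.

```python
from typing import Mapping


def build_pim_prompt(
    source_text: str,
    translations: Mapping[str, str],
    source_lang: str = "English",
    instruction: str = "Answer the question. The same input is given in several languages.",
) -> str:
    """Assemble one prompt that shows the same input in several languages.

    Hypothetical layout (an assumption, not the paper's template):
    instruction, one "<Language>: <text>" line per translation,
    the original input, then an answer cue.
    """
    lines = [instruction, ""]
    # Parallel versions of the input, one per language.
    for lang, text in translations.items():
        lines.append(f"{lang}: {text}")
    # The original (untranslated) input goes last, next to the answer cue.
    lines.append(f"{source_lang}: {source_text}")
    lines.append("Answer:")
    return "\n".join(lines)


# Hard-coded example translations (illustrative only).
example_prompt = build_pim_prompt(
    "What is the capital of France?",
    {
        "German": "Was ist die Hauptstadt von Frankreich?",
        "Spanish": "¿Cuál es la capital de Francia?",
    },
)
```

The resulting `example_prompt` string can be sent to any multilingual LLM in place of the monolingual input; adding more entries to the `translations` mapping corresponds to incorporating more languages.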

 
