Pruning Multilingual Large Language Models for Multilingual Inference

Abstract

Multilingual large language models (MLLMs), trained on multilingual balanced data, demonstrate better zero-shot learning performance in non-English languages compared to large language models trained on English-dominant data. However, the disparity in performance between English and non-English languages remains a challenge yet to be fully addressed. A distinctive characteristic of MLLMs is their high-quality translation capabilities, indicating an acquired proficiency in aligning between languages. This study explores how to enhance the zero-shot performance of MLLMs in non-English languages by leveraging their alignment capability between English and non-English languages. To achieve this, we first analyze the behavior of MLLMs when performing translation and reveal that there are large magnitude features that play a critical role in the translation process. Inspired by these findings, we retain the weights associated with operations involving the large magnitude features and prune other weights to force MLLMs to rely on these features for tasks beyond translation. We empirically demonstrate that this pruning strategy can enhance the MLLMs' performance in non-English languages.
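
The pruning idea described above can be illustrated with a minimal sketch. The abstract does not state the exact scoring rule, so the example below assumes one common activation-aware criterion: each weight is scored by its absolute value times the norm of the input feature it multiplies, and the lowest-scoring weights in each output row are zeroed. The function name `prune_by_feature_magnitude`, the `sparsity` parameter, and the per-row pruning granularity are all hypothetical choices made for illustration, not the authors' stated method.

```python
import numpy as np

def prune_by_feature_magnitude(weight, activations, sparsity=0.5):
    """Zero out the weights that interact least with large-magnitude features.

    weight:      (out_features, in_features) matrix of a linear layer.
    activations: (n_tokens, in_features) calibration inputs to that layer,
                 e.g. collected while the model translates (an assumption).
    sparsity:    fraction of weights to prune in each output row.
    """
    # Magnitude of each input feature over the calibration tokens.
    feature_norm = np.linalg.norm(activations, axis=0)       # (in_features,)
    # Assumed importance score: |w_ij| scaled by the norm of feature j.
    score = np.abs(weight) * feature_norm[None, :]
    n_prune = int(weight.shape[1] * sparsity)
    # Column indices of the n_prune lowest-scoring weights in each row.
    prune_idx = np.argsort(score, axis=1)[:, :n_prune]
    mask = np.ones_like(weight, dtype=bool)
    np.put_along_axis(mask, prune_idx, False, axis=1)
    return weight * mask, mask

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W = rng.normal(size=(4, 8))
    X = rng.normal(size=(16, 8))
    X[:, 3] *= 100.0  # emulate one large-magnitude feature
    pruned, mask = prune_by_feature_magnitude(W, X, sparsity=0.5)
    print("kept weights per row:", mask.sum(axis=1))
    print("rows keeping the weight on feature 3:", int(mask[:, 3].sum()))
```

Under this assumed criterion, weights attached to a large-magnitude feature receive high scores and tend to survive pruning, which is one way to make the layer rely more heavily on those features.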
