Abstract
As global demand for multilingual large language models (LLMs) grows, most LLMs remain overly focused on English, limiting access to advanced AI for non-English speakers. Current methods for enhancing multilingual capabilities largely rely on data-driven post-training techniques, such as multilingual instruction tuning or continual pre-training. However, these approaches exhibit significant limitations, including high resource cost, exacerbation of the off-target issue, and catastrophic forgetting of central language abilities. To this end, we propose Lens, a novel approach that enhances multilingual capabilities by leveraging LLMs' internal language representation spaces. Lens operates on two subspaces: the language-agnostic subspace, where it aligns target languages with the central language to inherit strong semantic representations, and the language-specific subspace, where it separates target and central languages to preserve linguistic specificity. Experiments on three English-centric LLMs show that Lens significantly improves multilingual performance while maintaining the model's English proficiency, achieving better results at lower computational cost than existing post-training approaches.