Abstract
Large Language Models (LLMs) increasingly incorporate multilingual capabilities, fueling the demand to transfer them into target language-specific models. However, most approaches, which blend the source model's embeddings by replacing the source vocabulary with the target language-specific vocabulary, may constrain expressive capacity in the target language, since the source model is predominantly trained on English data. In this paper, we propose Semantic Aware Linear Transfer (SALT), a novel cross-lingual transfer technique that recycles embeddings from target-language Pre-trained Language Models (PLMs) to transmit the deep representational strengths of PLM-derived embeddings to LLMs. SALT derives a unique regression line for each non-overlapping token, based on its similarity to tokens in the overlap of the source and target vocabularies, and uses that line to map the token into the LLM's embedding space. Our extensive experiments show that SALT significantly outperforms other transfer methods, achieving lower loss and faster convergence during language adaptation. Notably, SALT obtains remarkable performance in cross-lingual understanding setups compared to other methods. Furthermore, we highlight the scalable use of PLMs to enhance the functionality of contemporary LLMs by conducting experiments with varying architectures.
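The per-token regression idea summarized above can be illustrated with a minimal NumPy sketch. This is one possible reading of the abstract, not the authors' implementation: the function name `salt_style_init`, the nearest-neighbor count `k`, the use of cosine similarity, and the ridge term are all assumptions introduced here for illustration. For each non-overlapping target token, the sketch picks the most similar overlapping tokens in the PLM embedding space, fits a small linear map from their PLM embeddings to their LLM embeddings, and applies that map to the token's own PLM embedding.

```python
import numpy as np


def salt_style_init(plm_emb, llm_emb_overlap, overlap_idx, k=8, ridge=1e-3):
    """Hypothetical sketch: initialise LLM-space embeddings for a target vocabulary.

    plm_emb:         (V_tgt, d_plm) target-language PLM embeddings.
    llm_emb_overlap: (n_overlap, d_llm) source-LLM embeddings of overlapping
                     tokens; row i belongs to target token overlap_idx[i].
    overlap_idx:     (n_overlap,) indices into plm_emb of overlapping tokens.
    k, ridge:        assumed hyperparameters (neighbour count, regulariser).
    """
    overlap_idx = np.asarray(overlap_idx)
    v_tgt = plm_emb.shape[0]
    d_llm = llm_emb_overlap.shape[1]

    out = np.zeros((v_tgt, d_llm), dtype=float)
    out[overlap_idx] = llm_emb_overlap  # overlapping tokens are copied as-is

    # Unit-normalise so that dot products are cosine similarities.
    unit = plm_emb / (np.linalg.norm(plm_emb, axis=1, keepdims=True) + 1e-12)
    anchors = unit[overlap_idx]
    k = min(k, len(overlap_idx))

    is_overlap = np.zeros(v_tgt, dtype=bool)
    is_overlap[overlap_idx] = True

    for t in np.flatnonzero(~is_overlap):
        # The k overlapping tokens most similar to token t in PLM space.
        nn = np.argsort(-(anchors @ unit[t]))[:k]

        # Token-specific regression: PLM embedding (plus bias) -> LLM embedding.
        X = np.hstack([plm_emb[overlap_idx[nn]], np.ones((k, 1))])
        Y = llm_emb_overlap[nn]
        W = np.linalg.solve(X.T @ X + ridge * np.eye(X.shape[1]), X.T @ Y)

        # Map token t through its own fitted regression.
        out[t] = np.append(plm_emb[t], 1.0) @ W
    return out
```

Under these assumptions, each non-overlapping token gets its own fitted map rather than sharing one global projection, which is the property the abstract emphasizes. The ridge term keeps the solve well-posed when `k` is smaller than the PLM embedding dimension.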