A Parameter-efficient Language Extension Framework for Multilingual ASR

Abstract

Covering all languages with a multilingual speech recognition model (MASR) is very difficult. Performing language extension on top of an existing MASR is a desirable choice. In this study, the MASR continual learning problem is probabilistically decomposed into language identity prediction (LP) and cross-lingual adaptation (XLA) sub-problems. Based on this, we propose an architecture-based framework for language extension that can fundamentally solve catastrophic forgetting, dubbed PELE. PELE is designed to be parameter-efficient, incrementally incorporating an add-on module to adapt to a new language. Specifically, different parameter-efficient fine-tuning (PEFT) modules and their variants are explored as potential candidates to perform XLA. Experiments are carried out on 5 new languages with a wide range of low-resource data sizes. The best-performing PEFT candidate achieves satisfactory performance across all languages and outperforms the continual joint learning setting in three of the five languages. Notably, PEFT methods that focus on weight parameters or input features are found to be limited in performance, showing significantly inferior extension capability compared to inserting a lightweight module, such as an Adapter, between layers.
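
As a rough illustration of the LP/XLA split, the sketch below routes an utterance to one per-language add-on module and applies a residual bottleneck adapter to a hidden vector. It is only a sketch under stated assumptions: the abstract does not give the exact formulation, so it assumes the decomposition takes the form p(y|x) = sum over l of p(l|x) p(y|x, l) with a hard choice of the most likely language, and every name here (`Adapter`, `zero_init_adapter`, `extend`, `route`) is hypothetical rather than taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class Adapter:
    """Hypothetical bottleneck adapter: h + W_up * relu(W_down * h)."""

    def __init__(self, w_down, w_up):
        self.w_down = w_down  # bottleneck x dim
        self.w_up = w_up      # dim x bottleneck

    def __call__(self, h):
        # Down-project and apply ReLU.
        z = [max(0.0, sum(w * x for w, x in zip(row, h))) for row in self.w_down]
        # Up-project and add the residual connection.
        delta = [sum(w * x for w, x in zip(row, z)) for row in self.w_up]
        return [x + d for x, d in zip(h, delta)]


def zero_init_adapter(dim, bottleneck):
    """Assumed initialisation: a zero up-projection makes the adapter an identity map."""
    w_down = [[0.01] * dim for _ in range(bottleneck)]
    w_up = [[0.0] * bottleneck for _ in range(dim)]
    return Adapter(w_down, w_up)


def extend(adapters, language, dim, bottleneck):
    """Add a module for a new language; existing modules are left untouched."""
    adapters[language] = zero_init_adapter(dim, bottleneck)


def route(lp_scores, languages):
    """LP step (assumed hard decision): pick the language with the highest p(l|x)."""
    probs = softmax(lp_scores)
    best = max(range(len(languages)), key=lambda i: probs[i])
    return languages[best]
```

Because each new language gets its own module and the shared model and earlier modules are never updated, recognition of previously supported languages cannot degrade as long as LP selects the right module. That is the sense in which an architecture-based approach sidesteps catastrophic forgetting in this sketch.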
