Abstract
Neural network pruning compresses automatic speech recognition (ASR) models effectively. However, in multilingual ASR, language-agnostic pruning may lead to severe performance drops on some languages because language-agnostic pruning masks may not fit all languages and discard important language-specific parameters. In this work, we present ASR pathways, a sparse multilingual ASR model that activates language-specific sub-networks ("pathways"), such that the parameters for each language are learned explicitly. With the overlapping sub-networks, the shared parameters can also enable knowledge transfer for lower-resource languages via joint multilingual training. We propose a novel algorithm to learn ASR pathways, and evaluate the proposed method on 4 languages with a streaming RNN-T model. Our proposed ASR pathways outperform both dense models and a language-agnostically pruned model, and provide better performance on low-resource languages compared to the monolingual sparse models.
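
To make the idea of language-specific sub-networks concrete, the following is a minimal NumPy sketch, not the training algorithm proposed in this work. It assumes a single shared weight matrix, hypothetical language labels (`lang_a` to `lang_d`), and per-language binary masks obtained by magnitude pruning of randomly perturbed copies of the weights, a stand-in for whatever importance scores a real system would learn. The 50% sparsity level is also an illustrative assumption.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical shared dense weights for one layer (4 outputs, 6 inputs).
weights = rng.standard_normal((4, 6))

def magnitude_mask(scores, sparsity):
    """Return a boolean mask that keeps the largest-magnitude entries."""
    k = int(round(scores.size * (1.0 - sparsity)))  # number of entries kept
    flat = np.abs(scores).ravel()
    keep = np.argsort(flat)[::-1][:k]               # indices of the k largest
    mask = np.zeros(scores.size, dtype=bool)
    mask[keep] = True
    return mask.reshape(scores.shape)

# Placeholder language labels; the per-language scores are an assumption
# (perturbed weights standing in for language-specific importance).
languages = ["lang_a", "lang_b", "lang_c", "lang_d"]
masks = {
    lang: magnitude_mask(
        weights + 0.5 * rng.standard_normal(weights.shape), sparsity=0.5
    )
    for lang in languages
}

def forward(x, lang):
    """Apply only the sub-network ("pathway") selected for one language."""
    return x @ (weights * masks[lang]).T

# Parameters kept by every language: the part of the model shared by all
# pathways, through which joint training could transfer knowledge.
shared = np.logical_and.reduce([masks[lang] for lang in languages])
```

Each language activates the same number of parameters but a different subset of them, while the entries in `shared` are updated by data from every language during joint multilingual training.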