Code Switched and Code Mixed Speech Recognition for Indic languages

Abstract

Training multilingual automatic speech recognition (ASR) systems is challenging because acoustic and lexical information is typically language-specific. Training multilingual systems for Indic languages is even harder due to the lack of open-source datasets and of published results on different approaches. We compare the performance of an end-to-end multilingual speech recognition system to that of monolingual models conditioned on language identification (LID). The decoding information from a multilingual model is used for language identification and then combined with monolingual models to get an improvement of 50% in word error rate (WER) across languages. We also propose a similar technique to solve the code-switched problem and achieve a WER of 21.77 and 28.27 on Hindi-English and Bengali-English respectively. Our work discusses how transformer-based ASR, especially wav2vec 2.0, can be applied to developing multilingual and code-switched ASR for Indic languages.
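
The LID-conditioned routing described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes, purely for illustration, that the language can be guessed from the dominant Unicode script of the multilingual model's draft transcript, and the names `SCRIPT_TO_LANG`, `identify_language`, and `transcribe` are hypothetical. The ASR models are passed in as plain callables, so any real model (for example a wav2vec 2.0 checkpoint) would have to be wrapped by the reader.

```python
import unicodedata
from collections import Counter
from typing import Callable, Dict

# Hypothetical mapping from Unicode script name to a language code.
# Assumption: one script per language, which is a simplification.
SCRIPT_TO_LANG = {"DEVANAGARI": "hi", "BENGALI": "bn", "LATIN": "en"}


def identify_language(draft_transcript: str) -> str:
    """Guess the language from the dominant script of a draft transcript."""
    counts: Counter = Counter()
    for ch in draft_transcript:
        # The first word of a character's Unicode name is its script,
        # e.g. "DEVANAGARI LETTER NA" or "BENGALI LETTER BA".
        script = unicodedata.name(ch, "").split(" ")[0]
        if script in SCRIPT_TO_LANG:
            counts[SCRIPT_TO_LANG[script]] += 1
    if not counts:
        raise ValueError("no characters from a known script in transcript")
    return counts.most_common(1)[0][0]


def transcribe(
    audio: bytes,
    multilingual_asr: Callable[[bytes], str],
    monolingual_asr: Dict[str, Callable[[bytes], str]],
) -> str:
    """Decode with the multilingual model, then re-decode with the
    monolingual model selected by the inferred language."""
    draft = multilingual_asr(audio)
    lang = identify_language(draft)
    model = monolingual_asr.get(lang)
    # Fall back to the multilingual draft if no monolingual model exists.
    return model(audio) if model is not None else draft
```

In this sketch the multilingual model is used only to produce the signal for language identification, and the final transcript comes from the monolingual model whenever one is available for the inferred language.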
