Reducing language context confusion for end-to-end code-switching automatic speech recognition

Abstract

Code-switching is the alternation between languages within a single communication. Training end-to-end (E2E) automatic speech recognition (ASR) systems for code-switching is known to be challenging because the lack of data is compounded by increased language context confusion arising from the presence of more than one language. In this paper, we propose a language-related attention mechanism to reduce multilingual context confusion for the E2E code-switching ASR model based on the Equivalence Constraint (EC) Theory. This linguistic theory requires that any monolingual fragment occurring in a code-switching sentence must also occur in one of the monolingual sentences. It thereby establishes a bridge between monolingual data and code-switching data. By calculating the respective attention of multiple languages, our method can efficiently transfer language knowledge from rich monolingual data. We evaluate our method on the ASRU 2019 Mandarin-English code-switching challenge dataset. Compared with the baseline model, the proposed method achieves an 11.37% relative reduction in mix error rate.
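
The fragment property stated in the abstract can be illustrated with a small sketch. This is not the proposed attention mechanism, only a toy check of the constraint itself. The helper names, the script-based language tagging (CJK characters taken as Mandarin, everything else as English), and the tiny corpus are all assumptions made for illustration.

```python
# Toy illustration of the fragment property described in the abstract.
# All names and data here are hypothetical; this is not the paper's model.

def is_cjk(ch):
    """True if ch lies in the CJK Unified Ideographs block (assumed Mandarin)."""
    return "\u4e00" <= ch <= "\u9fff"


def monolingual_fragments(tokens):
    """Group consecutive same-language tokens into (lang, [tokens]) fragments."""
    fragments = []
    for tok in tokens:
        lang = "zh" if any(is_cjk(c) for c in tok) else "en"
        if fragments and fragments[-1][0] == lang:
            fragments[-1][1].append(tok)
        else:
            fragments.append((lang, [tok]))
    return fragments


def contains(sentence, fragment):
    """True if fragment appears as a contiguous token run inside sentence."""
    n = len(fragment)
    return any(sentence[i:i + n] == fragment
               for i in range(len(sentence) - n + 1))


def uncovered_fragments(cs_tokens, mono_corpus):
    """Return the fragments of a code-switching sentence that occur in no
    monolingual sentence of the matching language."""
    missing = []
    for lang, frag in monolingual_fragments(cs_tokens):
        if not any(contains(sent, frag) for sent in mono_corpus[lang]):
            missing.append((lang, frag))
    return missing


# Hypothetical tokenized data.
cs_tokens = ["我", "想", "book", "a", "ticket"]
mono_corpus = {
    "zh": [["我", "想", "回", "家"]],
    "en": [["please", "book", "a", "ticket", "now"]],
}

print(monolingual_fragments(cs_tokens))
print(uncovered_fragments(cs_tokens, mono_corpus))
```

In this toy setting every fragment of the mixed sentence is covered by a monolingual sentence, so the second call returns an empty list. A fragment with no monolingual match would be returned along with its language tag.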
