Unsupervised Domain Adaptation Schemes for Building ASR in Low-resource Languages

  • 2021-09-12 11:45:26
  • Anoop C S, Prathosh A P, A G Ramakrishnan


Building an automatic speech recognition (ASR) system from scratch requires a large amount of annotated speech data, which is difficult to collect in many languages. However, there are cases where the low-resource language shares a common acoustic space with a high-resource language having enough annotated data to build an ASR. In such cases, we show that the domain-independent acoustic models learned from the high-resource language through unsupervised domain adaptation (UDA) schemes can enhance the performance of the ASR in the low-resource language. We use the specific example of Hindi in the source domain and Sanskrit in the target domain. We explore two architectures: i) domain adversarial training using gradient reversal layer (GRL) and ii) domain separation networks (DSN). The GRL and DSN architectures give absolute improvements of 6.71% and 7.32%, respectively, in word error rate over the baseline deep neural network model when trained on just 5.5 hours of data in the target domain. We also show that choosing a proper language (Telugu) in the source domain can bring further improvement. The results suggest that UDA schemes can be helpful in the development of ASR systems for low-resource languages, mitigating the hassle of collecting large amounts of annotated speech data.
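
To illustrate the gradient reversal idea named in the abstract, here is a minimal sketch in plain Python. It is not the authors' implementation: the class name, the scaling factor `lam`, the toy feature vector, and the logistic domain classifier are all assumptions made for illustration, and the gradients are written out by hand rather than computed by a deep learning framework. The layer passes features through unchanged in the forward pass and flips (and scales) the domain-classifier gradient in the backward pass, so the shared feature extractor is pushed toward features that the domain classifier cannot separate.

```python
import math


class GradientReversal:
    """Identity in the forward pass; multiplies gradients by -lam in the backward pass."""

    def __init__(self, lam=1.0):
        self.lam = lam  # hypothetical reversal strength, not a value from the paper

    def forward(self, features):
        # Features reach the domain classifier unchanged.
        return list(features)

    def backward(self, grad_output):
        # The gradient flowing back to the feature extractor is sign-flipped and scaled.
        return [-self.lam * g for g in grad_output]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def domain_grad_wrt_features(features, weights, domain_label):
    """Gradient of the binary cross-entropy domain loss w.r.t. the input features
    for a logistic domain classifier p = sigmoid(weights . features)."""
    logit = sum(w * f for w, f in zip(weights, features))
    p = sigmoid(logit)
    return [(p - domain_label) * w for w in weights]


grl = GradientReversal(lam=0.5)
features = [0.2, -1.0, 0.7]  # toy output of a shared feature extractor (assumed)
weights = [1.0, 0.5, -2.0]   # toy domain classifier weights (assumed)

shared = grl.forward(features)
grad = domain_grad_wrt_features(shared, weights, domain_label=1.0)
reversed_grad = grl.backward(grad)

print("domain-loss gradient:       ", grad)
print("gradient seen by extractor: ", reversed_grad)
```

In an automatic differentiation framework the same behaviour would normally be expressed as a custom backward rule attached to an identity operation; the hand-written version above only shows the sign flip itself.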

