Exploiting Cross-Lingual Knowledge in Unsupervised Acoustic Modeling for Low-Resource Languages

Abstract

(Short version of Abstract) This thesis describes an investigation of unsupervised acoustic modeling (UAM) for automatic speech recognition (ASR) in the zero-resource scenario, where only untranscribed speech data is assumed to be available. UAM is not only important in addressing the general problem of data scarcity in ASR technology development but also essential to many non-mainstream applications, for example, language protection, language acquisition and pathological speech assessment. The present study focuses on two research problems. The first problem concerns unsupervised discovery of basic (subword-level) speech units in a given language. Under the zero-resource condition, the speech units can be inferred only from the acoustic signals, without requiring or involving any linguistic direction or constraints. The second problem is referred to as unsupervised subword modeling. In essence, a frame-level feature representation needs to be learned from untranscribed speech. The learned feature representation is the basis of subword unit discovery. It should be linguistically discriminative and robust to non-linguistic factors. The extensive use of cross-lingual knowledge in subword unit discovery and modeling is a particular focus of this research.
