Multilingual bottleneck features for subword modeling in zero-resource languages

Abstract

How can we effectively develop speech technology for languages where no transcribed data is available? Many existing approaches use no annotated resources at all, yet it makes sense to leverage information from large annotated corpora in other languages, for example in the form of multilingual bottleneck features (BNFs) obtained from a supervised speech recognition system. In this work, we evaluate the benefits of BNFs for subword modeling (feature extraction) in six unseen languages on a word discrimination task. First we establish a strong unsupervised baseline by combining two existing methods: vocal tract length normalisation (VTLN) and the correspondence autoencoder (cAE). We then show that BNFs trained on a single language already beat this baseline; including up to 10 languages results in additional improvements which cannot be matched by just adding more data from a single language. Finally, we show that the cAE can improve further on the BNFs if high-quality same-word pairs are available.
