Zero-Resource Multilingual Model Transfer: Learning What to Share

Abstract

Modern natural language processing and understanding applications have enjoyed a great boost from neural network models. However, this is not the case for most languages, especially low-resource ones with insufficient annotated training data. Cross-lingual transfer learning methods improve performance on a low-resource target language by leveraging labeled data from other (source) languages, typically with the help of cross-lingual resources such as parallel corpora. In this work, we propose a zero-resource multilingual transfer learning model that can utilize training data in multiple source languages while requiring neither target language training data nor cross-lingual supervision. Unlike existing methods that rely only on language-invariant features for cross-lingual transfer, our approach utilizes both language-invariant and language-specific features in a coherent way. Our model leverages adversarial networks to learn language-invariant features and mixture-of-experts models to dynamically exploit the relation between the target language and each individual source language. This enables our model to learn effectively what to share between the various languages in the multilingual setup. It results in significant performance gains over prior art, as shown in an extensive set of experiments over multiple text classification and sequence tagging tasks, including a large-scale real-world industry dataset.
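
As a rough illustration of the mixture-of-experts idea mentioned in the abstract, the sketch below combines the outputs of several per-source-language experts using softmax gate weights. It is a generic gating example and not the model proposed in the paper: the function names, the toy feature vectors, and the gate scores are all hypothetical, and the adversarial component for language-invariant features is omitted entirely.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mixture_of_experts(expert_outputs, gate_scores):
    """Weighted sum of expert outputs, with weights given by a softmax gate.

    expert_outputs: one feature vector per source-language expert (assumed layout).
    gate_scores: one unnormalized score per expert (assumed to come from a gate).
    """
    weights = softmax(gate_scores)
    dim = len(expert_outputs[0])
    mixed = [
        sum(w * out[i] for w, out in zip(weights, expert_outputs))
        for i in range(dim)
    ]
    return mixed, weights


# Hypothetical toy input: three source-language experts, 2-dimensional features.
expert_outputs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
# Hypothetical gate scores: the first source language is judged most relevant.
gate_scores = [2.0, 0.5, 0.5]

mixed, weights = mixture_of_experts(expert_outputs, gate_scores)
```

In this toy setting, a higher gate score for one source language gives its expert a larger share of the combined representation, which is the sense in which a gate can decide how much each source language contributes for a given input.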
