Cross-Lingual Transfer Learning for Question Answering

Abstract

Deep learning based question answering (QA) on English documents has achieved success because a large number of English training examples are available. However, for most languages, training examples for high-quality QA models are not available. In this paper, we explore the problem of cross-lingual transfer learning for QA, where a source language task with plentiful annotations is utilized to improve the performance of a QA model on a target language task with limited available annotations. We examine two different approaches. A machine translation (MT) based approach translates the source language into the target language, or vice versa. Although the MT-based approach brings improvement, it assumes the availability of a sentence-level translation system. A GAN-based approach incorporates a language discriminator to learn language-universal feature representations, and consequently transfer knowledge from the source language. The GAN-based approach rivals the performance of the MT-based approach with fewer linguistic resources. Applying both approaches simultaneously yields the best results. We use two English benchmark datasets, SQuAD and NewsQA, as source language data, and show significant improvements over a number of established baselines on a Chinese QA task. We achieve a new state of the art on the Chinese QA dataset.
