Multilingual Extractive Reading Comprehension by Runtime Machine Translation

Abstract

Despite recent work in Reading Comprehension (RC), progress has been mostly limited to English due to the lack of large-scale datasets in other languages. In this work, we introduce the first RC system for languages without RC training data. Given a target language without RC training data and a pivot language with RC training data (e.g. English), our method leverages existing RC resources in the pivot language by combining a competitive RC model in the pivot language with an attentive Neural Machine Translation (NMT) model. We first translate the data from the target to the pivot language, and then obtain an answer using the RC model in the pivot language. Finally, we recover the corresponding answer in the original language using soft-alignment attention scores from the NMT model. We create evaluation sets of RC data in two non-English languages, namely Japanese and French, to evaluate our method. Experimental results on these datasets show that our method significantly outperforms a back-translation baseline of a state-of-the-art product-level machine translation system.
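The final step, recovering the answer in the original language from attention scores, can be illustrated with a minimal sketch. The abstract does not specify the exact recovery rule, so the rule used here is an assumption: each pivot-language answer token is mapped to the target-language token it attends to most, and the answer is the smallest span covering those tokens. The function name `recover_source_span`, the toy sentences, and the attention values are all hypothetical.

```python
from typing import List, Tuple


def recover_source_span(
    attention: List[List[float]], pivot_start: int, pivot_end: int
) -> Tuple[int, int]:
    """Map an answer span in the pivot language back to the original text.

    attention[i][j] is the weight that pivot token i places on original
    (target-language) token j. The argmax-and-cover rule below is an
    illustrative assumption, not necessarily the rule used in the paper.
    """
    if not 0 <= pivot_start <= pivot_end < len(attention):
        raise ValueError("pivot span is out of range")
    aligned = []
    for i in range(pivot_start, pivot_end + 1):
        row = attention[i]
        # Index of the original token with the highest attention weight.
        aligned.append(max(range(len(row)), key=row.__getitem__))
    # Smallest contiguous span covering every aligned original token.
    return min(aligned), max(aligned)


# Toy example with made-up attention weights (French -> English).
source = ["le", "chat", "noir", "dort"]
pivot = ["the", "black", "cat", "sleeps"]
attention = [
    [0.85, 0.05, 0.05, 0.05],  # the    -> le
    [0.05, 0.10, 0.80, 0.05],  # black  -> noir
    [0.05, 0.80, 0.10, 0.05],  # cat    -> chat
    [0.05, 0.05, 0.05, 0.85],  # sleeps -> dort
]

# Suppose the pivot-language RC model answers "black cat" (tokens 1..2).
start, end = recover_source_span(attention, 1, 2)
answer = " ".join(source[start : end + 1])
```

In this toy case the word order differs between the two languages, yet the covering span still yields the contiguous original-language answer "chat noir".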
