XOR QA: Cross-lingual Open-Retrieval Question Answering

  • 2021-04-13 05:22:01
  • Akari Asai, Jungo Kasai, Jonathan H. Clark, Kenton Lee, Eunsol Choi, Hannaneh Hajishirzi


Multilingual question answering tasks typically assume answers exist in the same language as the question. Yet in practice, many languages face both information scarcity -- where languages have few reference articles -- and information asymmetry -- where questions reference concepts from other cultures. This work extends open-retrieval question answering to a cross-lingual setting enabling questions from one language to be answered via answer content from another language. We construct a large-scale dataset built on questions from TyDi QA lacking same-language answers. Our task formulation, called Cross-lingual Open Retrieval Question Answering (XOR QA), includes 40k information-seeking questions from across 7 diverse non-English languages. Based on this dataset, we introduce three new tasks that involve cross-lingual document retrieval using multi-lingual and English resources. We establish baselines with state-of-the-art machine translation systems and cross-lingual pretrained models. Experimental results suggest that XOR QA is a challenging task that will facilitate the development of novel techniques for multilingual question answering. Our data and code are available at https://nlp.cs.washington.edu/xorqa.


