ReQA: An Evaluation for End-to-End Answer Retrieval Models

Abstract

Popular QA benchmarks like SQuAD have driven progress on the task of identifying answer spans within a specific passage, with models now surpassing human performance. However, retrieving relevant answers from a huge corpus of documents is still a challenging problem, and places different requirements on the model architecture. There is growing interest in developing scalable answer retrieval models trained end-to-end, bypassing the typical document retrieval step. In this paper, we introduce Retrieval Question Answering (ReQA), a benchmark for evaluating large-scale sentence- and paragraph-level answer retrieval models. We establish baselines using both neural encoding models as well as classical information retrieval techniques. We release our evaluation code to encourage further work on this challenging task.
