Open Question Answering over Tables and Text

Abstract

In open question answering (QA), the answer to a question is produced by retrieving and then analyzing documents that might contain answers to the question. Most open QA systems have considered only retrieving information from unstructured text. Here we consider for the first time open QA over both tabular and textual data and present a new large-scale dataset, Open Table-Text Question Answering (OTT-QA), to evaluate performance on this task. Most questions in OTT-QA require multi-hop inference across tabular data and unstructured text, and the evidence required to answer a question can be distributed in different ways over these two types of input, making evidence retrieval challenging: our baseline model, which uses an iterative retriever and a BERT-based reader, achieves an exact match score of less than 10%. We then propose two novel techniques to address the challenge of retrieving and aggregating evidence for OTT-QA. The first technique is to use "early fusion" to group multiple highly relevant tabular and textual units into a fused block, which provides more context for the retriever to search over. The second technique is to use a cross-block reader to model the cross-dependency between multiple pieces of retrieved evidence with global-local sparse attention. Combining these two techniques improves the score significantly, to above 27%.
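
As a rough illustration of the early-fusion idea, the sketch below groups each table row with the passages whose titles appear in that row's cells, producing one fused block per row. The linking rule (exact title match), the data layout, and the names `fuse_blocks`, `table`, and `passages` are assumptions made for illustration only; the abstract does not specify how tabular and textual units are linked.

```python
def fuse_blocks(table, passages):
    """Group each table row with passages linked to its cells.

    table: dict with "title" (str), "header" (list of str),
           and "rows" (list of lists of str).
    passages: dict mapping a passage title to its text.
    Linking by exact title match is an illustrative assumption,
    not the method described in the paper.
    """
    blocks = []
    for row in table["rows"]:
        # Serialize the row as "column is value" pairs.
        row_text = "; ".join(
            f"{col} is {val}" for col, val in zip(table["header"], row)
        )
        # Attach every passage whose title exactly matches a cell value.
        linked = [passages[cell] for cell in row if cell in passages]
        # One fused block = table title + serialized row + linked passages.
        blocks.append(" ".join([table["title"] + ".", row_text + "."] + linked))
    return blocks


# Hypothetical toy data, not taken from OTT-QA.
table = {
    "title": "Example tournaments",
    "header": ["Year", "Host city"],
    "rows": [["2001", "Springfield"], ["2005", "Shelbyville"]],
}
passages = {"Springfield": "Springfield is a fictional city."}

for block in fuse_blocks(table, passages):
    print(block)
```

Under these assumptions, a retriever would index the fused blocks rather than isolated rows and passages, so a single retrieved unit can already carry both the tabular and the textual half of a multi-hop question.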
