Neural Databases

Abstract

In recent years, neural networks have shown impressive performance gains on long-standing AI problems, and in particular, answering queries from natural language text. These advances raise the question of whether they can be extended to a point where we can relax the fundamental assumption of database management, namely, that our data is represented as fields of a pre-defined schema. This paper presents a first step in answering that question. We describe NeuralDB, a database system with no pre-defined schema, in which updates and queries are given in natural language. We develop query processing techniques that build on the primitives offered by the state-of-the-art Natural Language Processing methods. We begin by demonstrating that at the core, recent NLP transformers, powered by pre-trained language models, can answer select-project-join queries if they are given the exact set of relevant facts. However, they cannot scale to non-trivial databases and cannot perform aggregation queries. Based on these findings, we describe a NeuralDB architecture that runs multiple Neural SPJ operators in parallel, each with a set of database sentences that can produce one of the answers to the query. The result of these operators is fed to an aggregation operator if needed. We describe an algorithm that learns how to create the appropriate sets of facts to be fed into each of the Neural SPJ operators. Importantly, this algorithm can be trained by the Neural SPJ operator itself. We experimentally validate the accuracy of NeuralDB and its components, showing that we can answer queries over thousands of sentences with very high accuracy.
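
The data flow described in the abstract (several SPJ operators run in parallel, each over its own set of facts, with an optional aggregation step at the end) can be illustrated with a minimal sketch. This is not the paper's implementation: `run_pipeline`, `toy_support_sets`, `toy_spj`, and the `FACTS` list are hypothetical names introduced here, and the "operator" is a plain keyword match standing in for a neural model. The learned fact-set generation step is likewise replaced by a trivial one-fact-per-set stub.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Optional, Sequence

# Hypothetical toy database: each fact is one natural-language sentence.
FACTS = [
    "Alice works at Acme",
    "Bob works at Initech",
    "Carol works at Acme",
]


def toy_support_sets(facts: Sequence[str]) -> List[List[str]]:
    """Stand-in for the learned generator: one fact per set (an assumption)."""
    return [[fact] for fact in facts]


def toy_spj(query: str, facts: Sequence[str]) -> Optional[str]:
    """Stand-in for a Neural SPJ operator (keyword match, not a model).

    Returns the first word of the first fact that mentions the query
    keyword, or None when this set of facts yields no answer.
    """
    for fact in facts:
        if query in fact:
            return fact.split()[0]
    return None


def run_pipeline(
    query: str,
    support_sets: Sequence[Sequence[str]],
    spj: Callable[[str, Sequence[str]], Optional[str]],
    aggregate: Optional[Callable[[List[str]], object]] = None,
):
    """Run one SPJ call per set of facts in parallel, then optionally aggregate."""
    with ThreadPoolExecutor() as pool:
        # pool.map preserves the order of support_sets in its results.
        partial = list(pool.map(lambda facts: spj(query, facts), support_sets))
    answers = [answer for answer in partial if answer is not None]
    return aggregate(answers) if aggregate is not None else answers


if __name__ == "__main__":
    sets = toy_support_sets(FACTS)
    print(run_pipeline("Acme", sets, toy_spj))                 # who works at Acme?
    print(run_pipeline("Acme", sets, toy_spj, aggregate=len))  # how many?
```

Keeping aggregation as a separate, optional step mirrors the abstract's observation that the SPJ operators themselves do not perform aggregation queries; a count is computed here only after the per-set answers have been collected.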
