Natural- to formal-language generation using Tensor Product Representations

Abstract

Generating formal-language represented by relational tuples, such as Lispprograms or mathematical expressions, from a natural-language input is anextremely challenging task because it requires to explicitly capture discretesymbolic structural information from the input to generate the output. Moststate-of-the-art neural sequence models do not explicitly capture suchstructure information, and thus do not perform well on these tasks. In thispaper, we propose a new encoder-decoder model based on Tensor ProductRepresentations (TPRs) for Natural- to Formal-language generation, calledTP-N2F. The encoder of TP-N2F employs TPR 'binding' to encode natural-languagesymbolic structure in vector space and the decoder uses TPR 'unbinding' togenerate a sequence of relational tuples, each consisting of a relation (oroperation) and a number of arguments, in symbolic space. TP-N2F considerablyoutperforms LSTM-based Seq2Seq models, creating a new state of the art resultson two benchmarks: the MathQA dataset for math problem solving, and theAlgoList dataset for program synthesis. Ablation studies show that improvementsare mainly attributed to the use of TPRs in both the encoder and decoder toexplicitly capture relational structure information for symbolic reasoning.

Quick Read (beta)

loading the full paper ...