Sequence-to-Sequence Models for Data-to-Text Natural Language Generation: Word- vs. Character-based Processing and Output Diversity

  • 2018-10-11 06:43:28
  • Glorianna Jagfeld, Sabrina Jenne, Ngoc Thang Vu

Abstract

We present a comparison of word-based and character-based sequence-to-sequence models for data-to-text natural language generation, which generate natural language descriptions for structured inputs. On the datasets of two recent generation challenges, our models achieve comparable or better automatic evaluation results than the best challenge submissions. Subsequent detailed statistical and human analyses shed light on the differences between the two input representations and the diversity of the generated texts. In a controlled experiment with synthetic training data generated from templates, we demonstrate the ability of neural models to learn novel combinations of the templates and thereby generalize beyond the linguistic structures they were trained on.

 
