Generalising Multilingual Concept-to-Text NLG with Language Agnostic Delexicalisation

Abstract

Concept-to-text Natural Language Generation is the task of expressing an input meaning representation in natural language. Previous approaches to this task have been able to generalise to rare or unseen instances by relying on a delexicalisation of the input. However, this often requires that the input appears verbatim in the output text. This poses challenges in multilingual settings, where the task expands to generate the output text in multiple languages given the same input. In this paper, we explore the application of multilingual models in concept-to-text and propose Language Agnostic Delexicalisation, a novel delexicalisation method that uses multilingual pretrained embeddings, and employs a character-level post-editing model to inflect words into their correct form during relexicalisation. Our experiments across five datasets and five languages show that multilingual models outperform monolingual models in concept-to-text and that our framework outperforms previous approaches, especially for low-resource languages.
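
To illustrate why verbatim matching is limiting, the following is a minimal sketch of exact-match delexicalisation, the kind of baseline the abstract contrasts against. It is not the paper's Language Agnostic Delexicalisation method. The function names, the placeholder format, and the example sentences are assumptions made for illustration and are not taken from the paper's datasets.

```python
from __future__ import annotations


def delexicalise(mr: dict[str, str], text: str) -> tuple[str, dict[str, str]]:
    """Replace each meaning-representation value found verbatim in the text
    with a slot placeholder. Hypothetical helper, for illustration only."""
    mapping: dict[str, str] = {}
    for slot, value in mr.items():
        placeholder = f"__{slot.upper()}__"
        if value in text:  # exact string match only
            text = text.replace(value, placeholder)
            mapping[placeholder] = value
    return text, mapping


def relexicalise(text: str, mapping: dict[str, str]) -> str:
    """Put the original values back in place of the placeholders."""
    for placeholder, value in mapping.items():
        text = text.replace(placeholder, value)
    return text


# Illustrative input, not drawn from the paper.
mr = {"name": "Alan Bean", "birthplace": "Wheeler, Texas"}

# English: the values appear verbatim, so both slots are delexicalised.
en_text, en_map = delexicalise(mr, "Alan Bean was born in Wheeler, Texas.")

# Russian: the values are transliterated and inflected, so nothing matches
# and the text is left unchanged.
ru_text, ru_map = delexicalise(mr, "Алан Бин родился в Уилере, Техас.")
```

In this sketch the English sentence becomes a template with two placeholders, while the Russian sentence yields an empty mapping because no value appears verbatim. That gap is the kind of problem the abstract says arises in multilingual settings.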
