Data Augmentation for Voice-Assistant NLU using BERT-based Interchangeable Rephrase

  • 2021-04-16 17:53:58
  • Akhila Yerukola, Mason Bretan, Hongxia Jin

Abstract

We introduce a data augmentation technique based on byte pair encoding and a BERT-like self-attention model to boost performance on spoken language understanding tasks. We compare and evaluate this method with a range of augmentation techniques encompassing generative models such as VAEs and performance-boosting techniques such as synonym replacement and back-translation. We show our method performs strongly on domain and intent classification tasks for a voice assistant and in a user-study focused on utterance naturalness and semantic similarity.
