QAGAN: Adversarial Approach To Learning Domain Invariant Language Features

Abstract

Training models that are robust to data domain shift has gained increasing interest in both academia and industry. Question answering, one of the typical problems in Natural Language Processing (NLP) research, has seen much success with the advent of large transformer models. However, existing approaches mostly work under the assumption that data is drawn from the same distribution during training and testing, which is unrealistic and non-scalable in the wild. In this paper, we explore an adversarial training approach towards learning domain-invariant features so that language models can generalize well to out-of-domain datasets. We also inspect various other ways to boost our model performance, including data augmentation by paraphrasing sentences, conditioning the end-of-answer-span prediction on the start word, and a carefully designed annealing function. Our initial results show that, in combination with these methods, we are able to achieve a $15.2\%$ improvement in EM score and a $5.6\%$ boost in F1 score on the out-of-domain validation dataset over the baseline. We also dissect our model outputs and visualize the model hidden states by projecting them onto a lower-dimensional space, and discover that our specific adversarial training approach indeed encourages the model to learn domain-invariant embeddings and brings them closer in the multi-dimensional space.
