Abstract
Current pre-training works in natural language generation pay little attention to the problem of exposure bias on downstream tasks. To address this issue, we propose an enhanced multi-flow sequence-to-sequence pre-training and fine-tuning framework named ERNIE-GEN, which bridges the discrepancy between training and inference with an infilling generation mechanism and a noise-aware generation method. To make generation closer to human writing patterns, this framework introduces a span-by-span generation flow that trains the model to predict semantically-complete spans consecutively rather than predicting word by word. Unlike existing pre-training methods, ERNIE-GEN incorporates multi-granularity target sampling to construct pre-training data, which enhances the correlation between encoder and decoder. Experimental results demonstrate that ERNIE-GEN achieves state-of-the-art results with a much smaller amount of pre-training data and parameters on a range of language generation tasks, including abstractive summarization (Gigaword and CNN/DailyMail), question generation (SQuAD), dialogue generation (Persona-Chat), and generative question answering (CoQA).