FactPEGASUS: Factuality-Aware Pre-training and Fine-tuning for Abstractive Summarization

  • 2022-05-16 18:39:14
  • David Wan, Mohit Bansal

Abstract

We present FactPEGASUS, an abstractive summarization model that addresses the problem of factuality during pre-training and fine-tuning: (1) We augment the sentence selection strategy of PEGASUS's (Zhang et al., 2020) pre-training objective to create pseudo-summaries that are both important and factual; (2) We introduce three complementary components for fine-tuning. The corrector removes hallucinations present in the reference summary, the contrastor uses contrastive learning to better differentiate nonfactual summaries from factual ones, and the connector bridges the gap between the pre-training and fine-tuning for better transfer of knowledge. Experiments on three downstream tasks demonstrate that FactPEGASUS substantially improves factuality evaluated by multiple automatic metrics and humans. Our thorough analysis suggests that FactPEGASUS is more factual than using the original pre-training objective in zero-shot and few-shot settings, retains factual behavior more robustly than strong baselines, and does not rely entirely on becoming more extractive to improve factuality. Our code and data are publicly available at: https://github.com/meetdavidwan/factpegasus

 
