AdvSumm: Adversarial Training for Bias Mitigation in Text Summarization

  • 2025-06-06 18:57:52
  • Mukur Gupta, Nikhil Reddy Varimalla, Nicholas Deas, Melanie Subbiah, Kathleen McKeown

Abstract

Large Language Models (LLMs) have achieved impressive performance in text summarization and are increasingly deployed in real-world applications. However, these systems often inherit associative and framing biases from pre-training data, leading to inappropriate or unfair outputs in downstream tasks. In this work, we present AdvSumm (Adversarial Summarization), a domain-agnostic training framework designed to mitigate bias in text summarization through improved generalization. Inspired by adversarial robustness, AdvSumm introduces a novel Perturber component that applies gradient-guided perturbations at the embedding level of Sequence-to-Sequence models, enhancing the model's robustness to input variations. We empirically demonstrate that AdvSumm effectively reduces different types of bias in summarization, specifically name-nationality bias and political framing bias, without compromising summarization quality. Compared to standard transformers and data augmentation techniques like back-translation, AdvSumm achieves stronger bias mitigation performance across benchmark datasets.
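The abstract does not spell out how the Perturber computes its perturbations, so the following is only a minimal sketch of the general idea of a gradient-guided perturbation at the embedding level, not the authors' implementation. It assumes a toy linear scorer with a squared-error loss and a fixed-size, L2-normalized gradient step; the names `loss_and_grad`, `perturb`, and `epsilon` are hypothetical and chosen for illustration.

```python
import math

def loss_and_grad(embeddings, weights, target):
    """Squared-error loss of a toy linear scorer and its gradient w.r.t. each token embedding."""
    n = len(embeddings)
    # Score is the mean over tokens of dot(embedding, weights).
    score = sum(sum(e_j * w_j for e_j, w_j in zip(e, weights)) for e in embeddings) / n
    err = score - target
    loss = err ** 2
    # d(loss)/d(e[t][j]) = 2 * err * weights[j] / n for every token t.
    grad = [[2.0 * err * w_j / n for w_j in weights] for _ in embeddings]
    return loss, grad

def perturb(embeddings, grad, epsilon):
    """Shift each token embedding by epsilon along its L2-normalized gradient (assumed step rule)."""
    perturbed = []
    for e, g in zip(embeddings, grad):
        norm = math.sqrt(sum(x * x for x in g))
        if norm == 0.0:
            # No gradient signal for this token: leave it unchanged.
            perturbed.append(list(e))
            continue
        perturbed.append([e_j + epsilon * g_j / norm for e_j, g_j in zip(e, g)])
    return perturbed

# Toy data (illustrative values only): two tokens with 3-dimensional embeddings.
embeddings = [[0.2, -0.1, 0.4], [0.0, 0.3, -0.2]]
weights = [0.5, -1.0, 0.25]
target = 1.0

clean_loss, grad = loss_and_grad(embeddings, weights, target)
adv_embeddings = perturb(embeddings, grad, epsilon=0.1)
adv_loss, _ = loss_and_grad(adv_embeddings, weights, target)
print(clean_loss < adv_loss)
```

In an adversarial training loop of this general kind, the model would then be trained on the perturbed embeddings (often alongside the clean ones), so that small, loss-increasing changes to the input representation no longer change the output much. Whether AdvSumm uses this exact step rule is not stated in the abstract.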

 
