Automatically Neutralizing Subjective Bias in Text

Abstract

Texts like news, encyclopedias, and some social media strive for objectivity. Yet bias in the form of inappropriate subjectivity - introducing attitudes via framing, presupposing truth, and casting doubt - remains ubiquitous. This kind of bias erodes our collective trust and fuels social conflict. To address this issue, we introduce a novel testbed for natural language generation: automatically bringing inappropriately subjective text into a neutral point of view ("neutralizing" biased text). We also offer the first parallel corpus of biased language. The corpus contains 180,000 sentence pairs and originates from Wikipedia edits that removed various framings, presuppositions, and attitudes from biased sentences. Last, we propose two strong encoder-decoder baselines for the task. A straightforward yet opaque CONCURRENT system uses a BERT encoder to identify subjective words as part of the generation process. An interpretable and controllable MODULAR algorithm separates these steps, using (1) a BERT-based classifier to identify problematic words and (2) a novel join embedding through which the classifier can edit the hidden states of the encoder. Large-scale human evaluation across four domains (encyclopedias, news headlines, books, and political speeches) suggests that these algorithms are a first step towards the automatic identification and reduction of bias.
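
As a rough illustration of how a classifier might edit encoder hidden states in the MODULAR setup, the sketch below adds a single vector to each token's hidden state, scaled by that token's predicted probability of being subjective. The additive form, the function name `apply_join_embedding`, and all variable names are assumptions made for illustration; the abstract does not specify how the join embedding is computed, and in practice the vector would be learned rather than hand-set.

```python
from typing import List

Vector = List[float]


def apply_join_embedding(
    hidden_states: List[Vector],
    bias_probs: List[float],
    join_vector: Vector,
) -> List[Vector]:
    """Return edited hidden states h'_i = h_i + p_i * v (assumed additive form).

    hidden_states: one encoder vector per token.
    bias_probs:    per-token probability of being subjective, as a
                   word-level classifier might produce.
    join_vector:   the shared vector v (hypothetical; hand-set here).
    """
    if len(hidden_states) != len(bias_probs):
        raise ValueError("need exactly one probability per token")

    edited = []
    for h, p in zip(hidden_states, bias_probs):
        if len(h) != len(join_vector):
            raise ValueError("hidden state and join vector sizes differ")
        # Tokens the classifier considers neutral (p near 0) pass through
        # almost unchanged; flagged tokens are shifted along v.
        edited.append([h_k + p * v_k for h_k, v_k in zip(h, join_vector)])
    return edited


if __name__ == "__main__":
    states = [[1.0, 2.0], [3.0, 4.0]]   # two tokens, hidden size 2
    probs = [0.0, 0.5]                  # second token flagged as subjective
    v = [2.0, -2.0]
    print(apply_join_embedding(states, probs, v))
```

Under this assumed form, setting a token's probability by hand changes only that token's hidden state, which is one way to read the abstract's description of the approach as interpretable and controllable.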
