Editing a classifier by rewriting its prediction rules

  • 2021-12-02 06:40:37
  • Shibani Santurkar, Dimitris Tsipras, Mahalaxmi Elango, David Bau, Antonio Torralba, Aleksander Madry

Abstract

We present a methodology for modifying the behavior of a classifier by directly rewriting its prediction rules. Our approach requires virtually no additional data collection and can be applied to a variety of settings, including adapting a model to new environments, and modifying it to ignore spurious features. Our code is available at https://github.com/MadryLab/EditingClassifiers.

 
