Abstract
In this paper, we advocate for representation learning as the key to mitigating unfair prediction outcomes downstream. Motivated by a scenario where learned representations are used by third parties with unknown objectives, we propose and explore adversarial representation learning as a natural method of ensuring those parties act fairly. We connect group fairness (demographic parity, equalized odds, and equal opportunity) to different adversarial objectives. Through worst-case theoretical guarantees and experimental validation, we show that the choice of this objective is crucial to fair prediction. Furthermore, we present the first in-depth experimental demonstration of fair transfer learning and demonstrate empirically that our learned representations admit fair predictions on new tasks while maintaining utility, an essential goal of fair representation learning.