Delayed Impact of Fair Machine Learning

Abstract

Fairness in machine learning has predominantly been studied in static classification settings without concern for how decisions change the underlying population over time. Conventional wisdom suggests that fairness criteria promote the long-term well-being of those groups they aim to protect. We study how static fairness criteria interact with temporal indicators of well-being, such as long-term improvement, stagnation, and decline in a variable of interest. We demonstrate that even in a one-step feedback model, common fairness criteria in general do not promote improvement over time, and may in fact cause harm in cases where an unconstrained objective would not. We completely characterize the delayed impact of three standard criteria, contrasting the regimes in which these exhibit qualitatively different behavior. In addition, we find that a natural form of measurement error broadens the regime in which fairness criteria perform favorably. Our results highlight the importance of measurement and temporal modeling in the evaluation of fairness criteria, suggesting a range of new challenges and trade-offs.
