50 Years of Test (Un)fairness: Lessons for Machine Learning

Abstract

Quantitative definitions of what is unfair and what is fair have been introduced in multiple disciplines for well over 50 years, including in education, hiring, and machine learning. We trace how the notion of fairness has been defined within the testing communities of education and hiring over the past half century, exploring the cultural and social context in which different fairness definitions have emerged. In some cases, earlier definitions of fairness are similar or identical to definitions of fairness in current machine learning research, and foreshadow current formal work. In other cases, insights into what fairness means and how to measure it have largely gone overlooked. We compare past and current notions of fairness along several dimensions, including the fairness criteria, the focus of the criteria (e.g., a test, a model, or its use), the relationship of fairness to individuals, groups, and subgroups, and the mathematical method for measuring fairness (e.g., classification, regression). This work points the way towards future research and measurement of (un)fairness that builds from our modern understanding of fairness while incorporating insights from the past.
