The Dataset Nutrition Label (2nd Gen): Leveraging Context to Mitigate Harms in Artificial Intelligence

Abstract

As the production of and reliance on datasets to produce automated decision-making systems (ADS) increases, so does the need for processes for evaluating and interrogating the underlying data. After launching the Dataset Nutrition Label in 2018, the Data Nutrition Project has made significant updates to the design and purpose of the Label, and is launching an updated Label in late 2020, which is previewed in this paper. The new Label includes context-specific Use Cases & Alerts presented through an updated design and user interface targeted towards the data scientist profile. This paper discusses the harm and bias from underlying training data that the Label is intended to mitigate, the current state of the work including new datasets being labeled, new and existing challenges, and further directions of the work, as well as Figures previewing the new label.
