FairFace: Face Attribute Dataset for Balanced Race, Gender, and Age

  • 2019-08-14 01:42:41
  • Kimmo Kärkkäinen, Jungseock Joo

Abstract

Existing public face datasets are strongly biased toward Caucasian faces, and other races (e.g., Latino) are significantly underrepresented. This can lead to inconsistent model accuracy, limit the applicability of face analytic systems to non-White race groups, and adversely affect research findings based on such skewed data. To mitigate the race bias in these datasets, we construct a novel face image dataset containing 108,501 images, with an emphasis on balanced race composition. We define 7 race groups: White, Black, Indian, East Asian, Southeast Asian, Middle East, and Latino. Images were collected from the YFCC-100M Flickr dataset and labeled with race, gender, and age groups. Evaluations were performed on existing face attribute datasets as well as novel image datasets to measure generalization performance. We find that the model trained on our dataset is substantially more accurate on novel datasets and that its accuracy is consistent across race and gender groups.

 
