FairHome: A Fair Housing and Fair Lending Dataset

  • 2024-09-09 19:34:26
  • Anusha Bagalkotkar, Aveek Karmakar, Gabriel Arnson, Ondrej Linda

Abstract

We present a Fair Housing and Fair Lending dataset (FairHome): a dataset with around 75,000 examples across 9 protected categories. To the best of our knowledge, FairHome is the first publicly available dataset labeled with binary labels for compliance risk in the housing domain. We demonstrate the usefulness and effectiveness of such a dataset by training a classifier and using it to detect potential violations when using a large language model (LLM) in the context of real-estate transactions. We benchmark the trained classifier against state-of-the-art LLMs including GPT-3.5, GPT-4, LLaMA-3, and Mistral Large in both zero-shot and few-shot contexts. Our classifier outperformed these models with an F1-score of 0.91, underscoring the effectiveness of our dataset.

 
