White Men Lead, Black Women Help? Benchmarking Language Agency Social Biases in LLMs

Abstract

Social biases can manifest in language agency. While several studies have approached agency-related bias in human-written language, very limited research has investigated such biases in Large Language Model (LLM)-generated content. In addition, previous works often rely on string-matching techniques to identify agentic and communal words within texts, which fall short of accurately classifying language agency. We introduce the novel Language Agency Bias Evaluation (LABE) benchmark, which comprehensively evaluates biases in LLMs by analyzing agency levels attributed to different demographic groups in model generations. LABE leverages 5,400 template-based prompts, an accurate agency classifier, and corresponding bias metrics to test for gender, racial, and intersectional language agency biases in LLMs on 3 text generation tasks: biographies, professor reviews, and reference letters. We also contribute the Language Agency Classification (LAC) dataset, consisting of 3,724 agentic and communal sentences. Using LABE, we unveil language agency social biases in 3 recent LLMs: ChatGPT, Llama3, and Mistral. We observe that: (1) LLM generations tend to demonstrate greater gender bias than human-written texts; (2) Models demonstrate remarkably higher levels of intersectional bias than the other bias aspects. Those who are at the intersection of gender and racial minority groups, such as Black females, are consistently described by texts with lower levels of agency, aligning with real-world social inequalities; (3) Among the 3 LLMs investigated, Llama3 demonstrates the greatest overall bias; (4) Not only does prompt-based mitigation fail to resolve language agency bias in LLMs, but it frequently leads to the exacerbation of biases in generated texts.
