Casteist but Not Racist? Quantifying Disparities in Large Language Model Bias between India and the West

Abstract

Large Language Models (LLMs), now used daily by millions of users, can encode societal biases, exposing their users to representational harms. A large body of scholarship on LLM bias exists but it predominantly adopts a Western-centric frame and attends comparatively less to bias levels and potential harms in the Global South. In this paper, we quantify stereotypical bias in popular LLMs according to an Indian-centric frame and compare bias levels between the Indian and Western contexts. To do this, we develop a novel dataset which we call Indian-BhED (Indian Bias Evaluation Dataset), containing stereotypical and anti-stereotypical examples for caste and religion contexts. We find that the majority of LLMs tested are strongly biased towards stereotypes in the Indian context, especially as compared to the Western context. We finally investigate Instruction Prompting as a simple intervention to mitigate such bias and find that it significantly reduces both stereotypical and anti-stereotypical biases in the majority of cases for GPT-3.5. The findings of this work highlight the need for including more diverse voices when evaluating LLMs.
