VisBias: Measuring Explicit and Implicit Social Biases in Vision Language Models

Abstract

This research investigates both explicit and implicit social biases exhibited by Vision-Language Models (VLMs). The key distinction between these bias types lies in the level of awareness: explicit bias refers to conscious, intentional biases, while implicit bias operates subconsciously. To analyze explicit bias, we directly pose questions to VLMs related to gender and racial differences: (1) Multiple-choice questions based on a given image (e.g., "What is the education level of the person in the image?"). (2) Yes-No comparisons using two images (e.g., "Is the person in the first image more educated than the person in the second image?"). For implicit bias, we design tasks where VLMs assist users but reveal biases through their responses: (1) Image description tasks: Models are asked to describe individuals in images, and we analyze disparities in textual cues across demographic groups. (2) Form completion tasks: Models draft a personal information collection form with 20 attributes, and we examine correlations among selected attributes for potential biases. We evaluate Gemini-1.5, GPT-4V, GPT-4o, LLaMA-3.2-Vision and LLaVA-v1.6. Our code and data are publicly available at https://github.com/uscnlp-lime/VisBias.
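
As a rough illustration of how the Yes-No comparison task could be scored, the sketch below tallies how often a model answers "yes" for each ordered pair of demographic groups. It is not the authors' implementation: the function name `yes_rate_by_group_pair`, the record layout, and the group labels are all hypothetical, and the toy replies are made up for illustration only.

```python
from collections import defaultdict


def yes_rate_by_group_pair(records):
    """Fraction of "yes" replies per (first_group, second_group) pair.

    `records` is an iterable of (first_group, second_group, answer) tuples,
    where `answer` is the model's raw reply. This schema is an assumption,
    not the format used in the VisBias repository.
    """
    counts = defaultdict(lambda: [0, 0])  # pair -> [yes_count, total]
    for first, second, answer in records:
        pair = (first, second)
        counts[pair][1] += 1
        # Treat any reply that begins with "yes" (case-insensitive) as a yes.
        if answer.strip().lower().startswith("yes"):
            counts[pair][0] += 1
    return {pair: yes / total for pair, (yes, total) in counts.items()}


# Toy, made-up replies purely for illustration; not results from the paper.
toy = [
    ("group_a", "group_b", "Yes"),
    ("group_a", "group_b", "No"),
    ("group_b", "group_a", "No"),
    ("group_b", "group_a", "no."),
]
rates = yes_rate_by_group_pair(toy)
```

Under this assumed setup, a large gap between the rate for a pair and the rate for the same pair in swapped order would suggest that the model's answer depends on group membership rather than on image order alone.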
