Abstract
Implicit bias refers to automatic or spontaneous mental processes that shape perceptions, judgments, and behaviors. Previous research examining `implicit bias' in large language models (LLMs) has often approached the phenomenon differently than how it is studied in humans by focusing primarily on model outputs rather than on model processing. To examine model processing, we present a method called the Reasoning Model Implicit Association Test (RM-IAT) for studying implicit bias-like patterns in reasoning models: LLMs that employ step-by-step reasoning to solve complex tasks. Using this method, we find that reasoning models require more tokens when processing association-incompatible information compared to association-compatible information. These findings suggest AI systems harbor patterns in processing information that are analogous to human implicit bias. We consider the implications of these implicit bias-like patterns for their deployment in real-world applications.