Can Reasoning Help Large Language Models Capture Human Annotator Disagreement?

Abstract

Variation in human annotation (i.e., disagreements) is common in NLP, often reflecting important information like task subjectivity and sample ambiguity. Modeling this variation is important for applications that are sensitive to such information. Although RLVR-style reasoning (Reinforcement Learning with Verifiable Rewards) has improved Large Language Model (LLM) performance on many tasks, it remains unclear whether such reasoning enables LLMs to capture informative variation in human annotation. In this work, we evaluate the influence of different reasoning settings on LLM disagreement modeling. We systematically evaluate each reasoning setting across model sizes, distribution expression methods, and steering methods, resulting in 60 experimental setups across 3 tasks. Surprisingly, our results show that RLVR-style reasoning degrades performance in disagreement modeling, while naive Chain-of-Thought (CoT) reasoning improves the performance of RLHF LLMs (RL from human feedback). These findings underscore the potential risk of replacing human annotators with reasoning LLMs, especially when disagreements are important.
