The Art of Refusal: A Survey of Abstention in Large Language Models

Abstract

Abstention, the refusal of large language models (LLMs) to provide an answer, is increasingly recognized for its potential to mitigate hallucinations and enhance safety in building LLM systems. In this survey, we introduce a framework to examine abstention behavior from three perspectives: the query, the model, and human values. We review the literature on abstention methods (categorized based on the development stages of LLMs), benchmarks, and evaluation metrics, and discuss the merits and limitations of prior work. We further identify and motivate areas for future research, such as encouraging the study of abstention as a meta-capability across tasks and customizing abstention abilities based on context. In doing so, we aim to broaden the scope and impact of abstention methodologies in AI systems.
