First Hallucination Tokens Are Different from Conditional Ones

  • 2025-10-01 16:26:55
  • Jakob Snel, Seong Joon Oh

Abstract

Hallucination, the generation of untruthful content, is one of the major concerns regarding foundational models. Detecting hallucinations at the token level is vital for real-time filtering and targeted correction, yet the variation of hallucination signals within token sequences is not fully understood. Leveraging the RAGTruth corpus with token-level annotations and reproduced logits, we analyse how these signals depend on a token's position within hallucinated spans, contributing to an improved understanding of token-level hallucination. Our results show that the first hallucinated token carries a stronger signal and is more detectable than conditional tokens. We release our analysis framework, along with code for logit reproduction and metric computation, at https://github.com/jakobsnl/RAGTruth_Xtended.
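
To make the distinction between first and conditional tokens concrete, the following is a minimal sketch of how token positions inside hallucinated spans could be separated. It is not taken from the released code: the function name `split_by_span_position` and the per-token 0/1 label format are assumptions made for illustration, and directly adjacent spans are treated as a single span.

```python
from typing import List, Tuple


def split_by_span_position(labels: List[int]) -> Tuple[List[int], List[int]]:
    """Split hallucinated token indices into first and conditional tokens.

    `labels` is an assumed format: one entry per generated token,
    1 if the token lies inside a hallucinated span, 0 otherwise.
    A "first" token opens a span; "conditional" tokens are the
    remaining tokens of that same span.
    """
    first: List[int] = []
    conditional: List[int] = []
    prev = 0  # label of the previous token; 0 before the sequence starts
    for i, lab in enumerate(labels):
        if lab == 1:
            if prev == 0:
                first.append(i)        # span starts here
            else:
                conditional.append(i)  # continuation of an open span
        prev = lab
    return first, conditional


# Hypothetical example: two hallucinated spans, at tokens 1-3 and 6-7.
labels = [0, 1, 1, 1, 0, 0, 1, 1]
first_idx, conditional_idx = split_by_span_position(labels)
print(first_idx)        # [1, 6]
print(conditional_idx)  # [2, 3, 7]
```

With the indices separated this way, any per-token signal derived from the logits can be aggregated for each group and compared.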

 
