Hallucination-Resistant Relation Extraction via Dependency-Aware Sentence Simplification and Two-tiered Hierarchical Refinement

Abstract

Relation extraction (RE) enables the construction of structured knowledge for many downstream applications. While large language models (LLMs) have shown great promise in this task, they often struggle to reliably determine whether a relation exists, particularly in sentences with complex syntax or subtle semantics. For instance, we find that Qwen2.5-14B-Instruct incorrectly predicts a relation in 96.9% of NO-RELATION instances on SciERC, revealing a severe hallucination problem. To address these challenges, we propose DEPTH, a framework that integrates Dependency-aware sEntence simPlification and Two-tiered Hierarchical refinement into the relation extraction pipeline. Given a sentence and its candidate entity pairs, DEPTH operates in two stages: (1) the Grounding module extracts relations for each pair by leveraging their shortest dependency path, distilling the sentence into a minimal yet coherent relational context that reduces syntactic noise while preserving key semantics; (2) the Refinement module aggregates all local predictions and revises them based on a holistic understanding of the sentence, correcting omissions and inconsistencies. We further introduce a causality-driven reward model that mitigates reward hacking by disentangling spurious correlations, enabling robust fine-tuning via reinforcement learning with human feedback. Experiments on eight well-established benchmarks demonstrate that DEPTH reduces the average hallucination rate to 7.9% while achieving a 9.3% improvement in average F1 score over existing LLM-based extraction baselines.

Quick Read (beta)

loading the full paper ...