Can Knowledge Editing Really Correct Hallucinations?

Abstract

Large Language Models (LLMs) suffer from hallucinations, that is, non-factual information in generated content, despite their superior capacities across tasks. Meanwhile, knowledge editing has emerged as a popular paradigm for correcting erroneous factual knowledge encoded in LLMs, with the advantage of avoiding retraining from scratch. However, one common issue with existing evaluation datasets for knowledge editing is that they do not ensure that LLMs actually generate hallucinated answers to the evaluation questions before editing. When LLMs are evaluated on such datasets after being edited by different techniques, the resulting performance cannot be used directly to assess how effectively different knowledge editing methods correct hallucinations. Thus, the fundamental question remains insufficiently validated: Can knowledge editing really correct hallucinations in LLMs? We propose HalluEditBench to holistically benchmark knowledge editing methods in correcting real-world hallucinations. First, we rigorously construct a massive hallucination dataset with 9 domains, 26 topics, and more than 6,000 hallucinations. Then, we assess the performance of knowledge editing methods holistically along five dimensions: Efficacy, Generalization, Portability, Locality, and Robustness. Through HalluEditBench, we provide new insights into the potential and limitations of different knowledge editing methods in correcting hallucinations, which could inspire future improvements and facilitate progress in the field of knowledge editing.
