Benchmarking Chinese Knowledge Rectification in Large Language Models

Abstract

While Large Language Models (LLMs) exhibit remarkable generative capabilities, they are not without flaws, particularly in the form of hallucinations. This issue is even more pronounced when LLMs are applied to specific languages and domains. For example, LLMs may generate nonsensical information when handling Chinese ancient poetry, proverbs, or idioms, owing to the lack of specific knowledge. To this end, this paper introduces a benchmark for rectifying Chinese knowledge in LLMs via knowledge editing. Specifically, we introduce a new Chinese dataset, CKnowEdit, by collecting seven types of knowledge from various sources, including classical texts, idioms, and content from Baidu Tieba Ruozhiba, thereby accounting for the unique polyphony, antithesis, and logical constructs inherent in the Chinese language. Through the analysis of this dataset, we uncover the challenges faced by current LLMs in mastering Chinese. Furthermore, our evaluation of state-of-the-art knowledge editing techniques on this dataset reveals substantial scope for advancement in the rectification of Chinese knowledge. Code and dataset are available at https://github.com/zjunlp/EasyEdit.
