Abstract
Code generation aims to automatically generate code from input requirements, significantly enhancing development efficiency. Recent approaches based on large language models (LLMs) have shown promising results and revolutionized the code generation task. Despite this promising performance, LLMs often generate content with hallucinations, especially in code generation scenarios that require handling complex contextual dependencies in practical development processes. Although a previous study has analyzed hallucinations in LLM-powered code generation, it is limited to standalone function generation. In this paper, we conduct an empirical study of the phenomena, mechanisms, and mitigation of LLM hallucinations within more practical and complex development contexts in the repository-level generation scenario. First, we manually examine the code generation results from six mainstream LLMs to establish a hallucination taxonomy of LLM-generated code. Next, we elaborate on the phenomenon of hallucinations and analyze their distribution across different models. We then analyze the causes of hallucinations and identify four potential factors contributing to them. Finally, we propose a RAG-based mitigation method, which demonstrates consistent effectiveness across all studied LLMs. The replication package, including code, data, and experimental results, is available at https://github.com/DeepSoftwareAnalytics/LLMCodingHallucination