Meta-RAG on Large Codebases Using Code Summarization

Abstract

Large Language Model (LLM) systems have been at the forefront of applied Artificial Intelligence (AI) research in a multitude of domains. One such domain is software development, where researchers have pushed the automation of a number of code tasks through LLM agents. Software development is a complex ecosystem that stretches far beyond code implementation and well into the realm of code maintenance. In this paper, we propose a multi-agent system to localize bugs in large pre-existing codebases using information retrieval and LLMs. Our system introduces a novel Retrieval Augmented Generation (RAG) approach, Meta-RAG, where we utilize summaries to condense codebases by an average of 79.8% into a compact, structured, natural language representation. We then use an LLM agent to determine which parts of the codebase are critical for bug resolution, i.e., bug localization. We demonstrate the usefulness of Meta-RAG through evaluation with the SWE-bench Lite dataset. Meta-RAG scores 84.67% and 53.0% for file-level and function-level correct localization rates, respectively, achieving state-of-the-art performance.
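
To make the idea of condensing a codebase into summaries concrete, the following is a minimal sketch and not the Meta-RAG implementation. It assumes a non-LLM stand-in for summarization (one line per function or class, taken from the first docstring line) and measures the reduction in characters; the abstract does not state which unit the 79.8% figure uses. The names `summarize_source`, `reduction_percent`, and `SAMPLE` are hypothetical.

```python
import ast


def summarize_source(source: str) -> str:
    """Condense Python source into one summary line per function or class.

    Assumption: the first docstring line stands in for an LLM-written summary.
    """
    tree = ast.parse(source)
    lines = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            kind = "class" if isinstance(node, ast.ClassDef) else "def"
            doc = ast.get_docstring(node) or "(no docstring)"
            first_line = doc.strip().splitlines()[0]
            lines.append(f"{kind} {node.name}: {first_line}")
    return "\n".join(lines)


def reduction_percent(original: str, summary: str) -> float:
    """Percentage by which the summary is shorter than the original, in characters."""
    if not original:
        return 0.0
    return 100.0 * (1 - len(summary) / len(original))


# Hypothetical one-function "codebase" used only for illustration.
SAMPLE = '''
def parse_config(path):
    """Read a config file and return its settings as a dict."""
    settings = {}
    with open(path) as handle:
        for line in handle:
            key, _, value = line.partition("=")
            settings[key.strip()] = value.strip()
    return settings
'''

summary = summarize_source(SAMPLE)
print(summary)
print(f"reduction: {reduction_percent(SAMPLE, summary):.1f}%")
```

In a retrieval setting, a localization agent would read summaries like these instead of the full source and name the files or functions most likely to contain the bug; that step requires an LLM and is not shown here.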
