Repair-R1: Better Test Before Repair

Abstract

APR (Automated Program Repair) aims to automatically locate program defects, generate patches, and validate the repairs. Existing APR techniques are often combined with LLMs (Large Language Models) to leverage the models' code-related knowledge and improve repair effectiveness. Current LLM-based APR methods typically use test cases only during the inference stage, adopting an iterative approach that performs repair first and validates it through test execution afterward. This conventional paradigm neglects two important aspects: the potential contribution of test cases in the training phase, and the possibility of leveraging testing prior to repair. To address this, we propose Repair-R1, which introduces test cases into the model's training phase and shifts test generation to precede repair. The model is required to first generate discriminative test cases that can distinguish defective behaviors, and then perform repair based on these tests. This enables the model to better locate defects and understand their underlying causes, thereby improving repair effectiveness. We implement Repair-R1 with three different backbone models, using RL (reinforcement learning) to co-optimize test generation and bug repair. Experimental results on four widely adopted benchmarks demonstrate the superiority of Repair-R1. Specifically, compared to vanilla models, Repair-R1 improves repair success rate by 2.68% to 48.29%, test generation success rate by 16.38% to 53.28%, and test coverage by 0.78% to 53.96%. We publish the code and weights at https://github.com/Tomsawyerhu/APR-RL and https://huggingface.co/tomhu/Qwen3-4B-RL-5000-step.
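To make the notion of a discriminative test case concrete, the following is a minimal sketch, not the authors' implementation. It assumes that a test is discriminative when it passes on a correct version of the program and fails on the defective one; the abstract does not spell out this definition. The functions `buggy_max` and `fixed_max`, the `(args, expected)` test representation, and the helper names are all hypothetical and chosen only for illustration.

```python
from typing import Callable, List, Tuple

# Hypothetical test representation: (positional arguments, expected result).
TestCase = Tuple[tuple, object]


def passes(func: Callable, test: TestCase) -> bool:
    """Return True if func produces the expected result without raising."""
    args, expected = test
    try:
        return func(*args) == expected
    except Exception:
        return False


def is_discriminative(buggy: Callable, fixed: Callable, test: TestCase) -> bool:
    """Assumed definition: the test passes on the fixed code and fails on the buggy code."""
    return passes(fixed, test) and not passes(buggy, test)


def discriminative_rate(buggy: Callable, fixed: Callable, tests: List[TestCase]) -> float:
    """Fraction of tests that expose the defect (0.0 for an empty list)."""
    if not tests:
        return 0.0
    hits = sum(is_discriminative(buggy, fixed, t) for t in tests)
    return hits / len(tests)


def buggy_max(xs: list) -> int:
    """Hypothetical defective program: the loop skips the last element."""
    best = xs[0]
    for x in xs[:-1]:
        if x > best:
            best = x
    return best


def fixed_max(xs: list) -> int:
    """Hypothetical repaired program: the loop visits every element."""
    best = xs[0]
    for x in xs:
        if x > best:
            best = x
    return best


tests: List[TestCase] = [
    (([1, 2, 3],), 3),  # maximum is last: exposes the defect
    (([3, 1, 2],), 3),  # maximum is first: both versions agree
]
```

In this sketch only the first test separates the two versions, so `discriminative_rate(buggy_max, fixed_max, tests)` evaluates to 0.5. A signal of this kind is one plausible way to score generated tests before attempting a repair.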
