Abstract
The automated repair of C++ compilation errors presents a significant challenge, the resolution of which is critical for developer productivity. Progress in this domain is constrained by two primary factors: the scarcity of large-scale, high-fidelity datasets and the limitations of conventional supervised methods, which often fail to generate semantically correct patches. This paper addresses these gaps by introducing a comprehensive framework with three core contributions. First, we present CCrepair, a novel, large-scale C++ compilation error dataset constructed through a sophisticated generate-and-verify pipeline. Second, we propose a Reinforcement Learning (RL) paradigm guided by a hybrid reward signal, shifting the focus from mere compilability to the semantic quality of the fix. Finally, we establish the robust, two-stage evaluation system providing this signal, centered on an LLM-as-a-Judge whose reliability has been rigorously validated against the collective judgments of a panel of human experts. This integrated approach aligns the training objective with generating high-quality, non-trivial patches that are both syntactically and semantically correct. The effectiveness of our approach was demonstrated experimentally. Our RL-trained Qwen2.5-1.5B-Instruct model achieved performance comparable to a Qwen2.5-14B-Instruct model, validating the efficiency of our training paradigm. Our work provides the research community with a valuable new dataset and a more effective paradigm for training and evaluating robust compilation repair models, paving the way for more practical and reliable automated programming assistants.