Abstract
Large language models (LLMs) have demonstrated an impressive ability to generate code on competitive programming tasks. However, with a limited number of samples, LLMs still suffer from poor accuracy. Inspired by the process of human programming, we propose a generate-and-edit approach named Self-Edit that utilizes execution results of the code generated by LLMs to improve code quality on competitive programming tasks. We execute the generated code on the example test case provided in the question and wrap the execution results into a supplementary comment. Using this comment as guidance, our fault-aware code editor is employed to correct errors in the generated code. We perform extensive evaluations across two competitive programming datasets with nine different LLMs. Compared to directly generating from LLMs, our approach can improve the average pass@1 by 89\% on APPS-dev, 31\% on APPS-test, and 48\% on HumanEval over nine popular code generation LLMs with parameter sizes ranging from 110M to 175B. Compared to other post-processing methods, our method demonstrates superior accuracy and efficiency.
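
The execute-and-comment step described above can be illustrated with a minimal sketch. It assumes the generated program is Python and reads the example input from standard input. The function name `build_supplementary_comment`, the wording of the comments, and the timeout value are hypothetical choices made for illustration, not the format used by Self-Edit itself.

```python
import subprocess
import sys


def build_supplementary_comment(code: str, test_input: str,
                                expected_output: str,
                                timeout: float = 5.0) -> str:
    """Run `code` on one example test case and summarize the outcome.

    Hypothetical helper: the comment templates below are illustrative only.
    """
    try:
        result = subprocess.run(
            [sys.executable, "-c", code],  # run the candidate in a fresh interpreter
            input=test_input,
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return "# Comment: the program timed out on the example test case."

    if result.returncode != 0:
        # Keep only the last traceback line, e.g. "NameError: ..."
        lines = result.stderr.strip().splitlines()
        error = lines[-1] if lines else "unknown error"
        return f"# Comment: the program raised an error: {error}"

    actual = result.stdout.strip()
    if actual == expected_output.strip():
        return "# Comment: the program passed the example test case."
    return (
        "# Comment: wrong answer on the example test case. "
        f"Input: {test_input.strip()!r}. "
        f"Expected: {expected_output.strip()!r}. Got: {actual!r}."
    )
```

In a pipeline of this shape, the returned comment would be attached to the problem description and the generated code and passed to an editor model, which is asked to produce a corrected program. That editing step is not shown here.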