ReCode: Updating Code API Knowledge with Reinforcement Learning

Abstract

Large Language Models (LLMs) exhibit remarkable code generation capabilities but falter when adapting to frequent updates in external library APIs. This limitation stems from reliance on outdated API knowledge in their training data, which persists even when current documentation is available, and it impedes reliable code generation in dynamic environments. To tackle this issue, we propose ReCode (rule-based Reinforcement learning for Code Update), a novel framework that mimics how human programmers adapt to API changes. Specifically, we construct a dataset of approximately 2,000 entries to train LLMs to perform version migration based on updated information. We then introduce a modified string similarity metric for code evaluation as the reward for reinforcement learning. Our experiments demonstrate that ReCode substantially boosts LLMs' code generation performance in dynamic API scenarios, especially on the unseen CodeUpdateArena task. Crucially, compared to supervised fine-tuning, ReCode has less impact on LLMs' general code generation abilities. We apply ReCode to various LLMs and reinforcement learning algorithms (GRPO and DAPO), all achieving consistent improvements. Notably, after training, Qwen2.5-Coder-7B outperforms both the 32B-parameter code instruction-tuned model and the reasoning model with the same architecture. Code is available at https://github.com/zjunlp/ReCode.
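
To make the idea of a string-similarity reward concrete, the following is a minimal sketch and not the ReCode implementation. The abstract does not specify how the metric is modified, so this example substitutes a plain normalized similarity ratio from Python's standard `difflib` module. The function name `similarity_reward`, the whitespace normalization, and the sample snippets are assumptions made for illustration only.

```python
import difflib


def _normalize(code: str) -> str:
    # Assumption: strip trailing whitespace and surrounding blank lines so
    # that purely cosmetic differences do not lower the reward.
    return "\n".join(line.rstrip() for line in code.strip().splitlines())


def similarity_reward(generated: str, reference: str) -> float:
    """Hypothetical rule-based reward in [0, 1].

    Stand-in for the paper's modified metric: the character-level
    similarity ratio between generated code and the reference migration.
    """
    gen, ref = _normalize(generated), _normalize(reference)
    matcher = difflib.SequenceMatcher(None, gen, ref, autojunk=False)
    return matcher.ratio()


# Illustrative migration pair: a call written against an older API
# and the updated form a model is expected to produce.
stale = "value = np.asscalar(arr)"
target = "value = arr.item()"

reward_stale = similarity_reward(stale, target)
reward_exact = similarity_reward(target, target)
```

Under this sketch, an output that reproduces the updated call exactly receives the maximum reward, while an output that still uses the outdated call receives a lower but nonzero reward, which gives the policy a graded signal rather than a binary pass or fail.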
