DeepSeek-Prover-V2: Advancing Formal Mathematical Reasoning via Reinforcement Learning for Subgoal Decomposition

  • 2025-04-30 17:57:48
  • Z. Z. Ren, Zhihong Shao, Junxiao Song, Huajian Xin, Haocheng Wang, Wanjia Zhao, Liyue Zhang, Zhe Fu, Qihao Zhu, Dejian Yang, Z. F. Wu, Zhibin Gou, Shirong Ma, Hongxuan Tang, Yuxuan Liu, Wenjun Gao, Daya Guo, Chong Ruan
  • 0

Abstract

We introduce DeepSeek-Prover-V2, an open-source large language model designedfor formal theorem proving in Lean 4, with initialization data collectedthrough a recursive theorem proving pipeline powered by DeepSeek-V3. Thecold-start training procedure begins by prompting DeepSeek-V3 to decomposecomplex problems into a series of subgoals. The proofs of resolved subgoals aresynthesized into a chain-of-thought process, combined with DeepSeek-V3'sstep-by-step reasoning, to create an initial cold start for reinforcementlearning. This process enables us to integrate both informal and formalmathematical reasoning into a unified model. The resulting model,DeepSeek-Prover-V2-671B, achieves state-of-the-art performance in neuraltheorem proving, reaching 88.9% pass ratio on the MiniF2F-test and solving 49out of 658 problems from PutnamBench. In addition to standard benchmarks, weintroduce ProverBench, a collection of 325 formalized problems, to enrich ourevaluation, including 15 selected problems from the recent AIME competitions(years 24-25). Further evaluation on these 15 AIME problems shows that themodel successfully solves 6 of them. In comparison, DeepSeek-V3 solves 8 ofthese problems using majority voting, highlighting that the gap between formaland informal mathematical reasoning in large language models is substantiallynarrowing.

 

Quick Read (beta)

loading the full paper ...