DeepSeek-Prover-V2: Advancing Formal Mathematical Reasoning via Reinforcement Learning for Subgoal Decomposition

Abstract

We introduce DeepSeek-Prover-V2, an open-source large language model designedfor formal theorem proving in Lean 4, with initialization data collectedthrough a recursive theorem proving pipeline powered by DeepSeek-V3. Thecold-start training procedure begins by prompting DeepSeek-V3 to decomposecomplex problems into a series of subgoals. The proofs of resolved subgoals aresynthesized into a chain-of-thought process, combined with DeepSeek-V3'sstep-by-step reasoning, to create an initial cold start for reinforcementlearning. This process enables us to integrate both informal and formalmathematical reasoning into a unified model. The resulting model,DeepSeek-Prover-V2-671B, achieves state-of-the-art performance in neuraltheorem proving, reaching 88.9% pass ratio on the MiniF2F-test and solving 49out of 658 problems from PutnamBench. In addition to standard benchmarks, weintroduce ProverBench, a collection of 325 formalized problems, to enrich ourevaluation, including 15 selected problems from the recent AIME competitions(years 24-25). Further evaluation on these 15 AIME problems shows that themodel successfully solves 6 of them. In comparison, DeepSeek-V3 solves 8 ofthese problems using majority voting, highlighting that the gap between formaland informal mathematical reasoning in large language models is substantiallynarrowing.

Quick Read (beta)

loading the full paper ...