Seed-Prover: Deep and Broad Reasoning for Automated Theorem Proving

  • 2025-08-01 03:36:47
  • Luoxin Chen, Jinming Gu, Liankai Huang, Wenhao Huang, Zhicheng Jiang, Allan Jie, Xiaoran Jin, Xing Jin, Chenggang Li, Kaijing Ma, Cheng Ren, Jiawei Shen, Wenlei Shi, Tong Sun, He Sun, Jiahui Wang, Siran Wang, Zhihong Wang, Chenrui Wei, Shufa Wei, Yonghui Wu, Yuchen Wu, Yihang Xia, Huajian Xin, Fan Yang, Huaiyuan Ying, Hongyi Yuan, Zheng Yuan, Tianyang Zhan, Chi Zhang, Yue Zhang, Ge Zhang, Tianyun Zhao, Jianqiu Zhao, Yichi Zhou, Thomas Hanwen Zhu
  • 0

Abstract

LLMs have demonstrated strong mathematical reasoning abilities by leveragingreinforcement learning with long chain-of-thought, yet they continue tostruggle with theorem proving due to the lack of clear supervision signals whensolely using natural language. Dedicated domain-specific languages like Leanprovide clear supervision via formal verification of proofs, enabling effectivetraining through reinforcement learning. In this work, we propose\textbf{Seed-Prover}, a lemma-style whole-proof reasoning model. Seed-Provercan iteratively refine its proof based on Lean feedback, proved lemmas, andself-summarization. To solve IMO-level contest problems, we design threetest-time inference strategies that enable both deep and broad reasoning.Seed-Prover proves $78.1\%$ of formalized past IMO problems, saturates MiniF2F,and achieves over 50\% on PutnamBench, outperforming the previousstate-of-the-art by a large margin. To address the lack of geometry support inLean, we introduce a geometry reasoning engine \textbf{Seed-Geometry}, whichoutperforms previous formal geometry engines. We use these two systems toparticipate in IMO 2025 and fully prove 5 out of 6 problems. This workrepresents a significant advancement in automated mathematical reasoning,demonstrating the effectiveness of formal verification with longchain-of-thought reasoning.

 

Quick Read (beta)

loading the full paper ...