Abstract
LLMs have demonstrated strong mathematical reasoning abilities by leveragingreinforcement learning with long chain-of-thought, yet they continue tostruggle with theorem proving due to the lack of clear supervision signals whensolely using natural language. Dedicated domain-specific languages like Leanprovide clear supervision via formal verification of proofs, enabling effectivetraining through reinforcement learning. In this work, we propose\textbf{Seed-Prover}, a lemma-style whole-proof reasoning model. Seed-Provercan iteratively refine its proof based on Lean feedback, proved lemmas, andself-summarization. To solve IMO-level contest problems, we design threetest-time inference strategies that enable both deep and broad reasoning.Seed-Prover proves $78.1\%$ of formalized past IMO problems, saturates MiniF2F,and achieves over 50\% on PutnamBench, outperforming the previousstate-of-the-art by a large margin. To address the lack of geometry support inLean, we introduce a geometry reasoning engine \textbf{Seed-Geometry}, whichoutperforms previous formal geometry engines. We use these two systems toparticipate in IMO 2025 and fully prove 5 out of 6 problems. This workrepresents a significant advancement in automated mathematical reasoning,demonstrating the effectiveness of formal verification with longchain-of-thought reasoning.