
DeepSeek-Prover-V2 is an open-source large language model for formal theorem proving in Lean 4. Its training relies heavily on synthetic data: DeepSeek-V3 decomposes each problem into subgoals, and a smaller 7B prover model then solves those subgoals recursively. Training proceeds in two stages, supervised fine-tuning followed by reinforcement learning with Group Relative Policy Optimization (GRPO), with the aim of bridging informal reasoning and formal proof. The model reports state-of-the-art results in neural theorem proving, particularly in its high-precision Chain-of-Thought mode.
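
To make the subgoal decomposition concrete, here is a minimal Lean 4 sketch of what a decomposed proof might look like. The theorem, its name, and the choice of subgoals are illustrative assumptions and are not taken from the model's training data. The sketch also assumes Mathlib is available. The first theorem states each subgoal as a `have` and leaves it open with `sorry`. The second shows the same proof after each subgoal has been closed on its own, which is the role the summary attributes to the smaller prover.

```lean
import Mathlib

-- Hypothetical proof sketch: the high-level argument is fixed,
-- while each subgoal is stated with `have` and deferred with `sorry`.
theorem sketch_sq_sum (a b : ℝ) : 0 ≤ a ^ 2 + b ^ 2 := by
  have h₁ : 0 ≤ a ^ 2 := by sorry
  have h₂ : 0 ≤ b ^ 2 := by sorry
  exact add_nonneg h₁ h₂

-- The same proof after each subgoal has been solved independently
-- and substituted back into the sketch.
theorem solved_sq_sum (a b : ℝ) : 0 ≤ a ^ 2 + b ^ 2 := by
  have h₁ : 0 ≤ a ^ 2 := sq_nonneg a
  have h₂ : 0 ≤ b ^ 2 := sq_nonneg b
  exact add_nonneg h₁ h₂
```

Lean accepts the first version with a warning about `sorry`, so the overall structure can be type-checked before any subgoal is solved. Each subgoal is then a smaller, self-contained statement.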