Automated Theorem Proving

The goal of Automated Theorem Proving is to automatically generate a proof, given a conjecture (the target theorem) and a knowledge base of known facts, all expressed in a formal language. Automated Theorem Proving is useful in a wide range of applications, including the verification and synthesis of software and hardware systems.

Source: Learning to Prove Theorems by Learning to Generate Theorems
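As a rough illustration of the definition above, the following minimal sketch proves a goal from a small knowledge base by forward chaining over propositional Horn rules. It is a toy example, not the method of any paper listed here; the function name `prove`, the rule encoding, and the symbols `p`, `q`, `r` are assumptions made for illustration only.

```python
from typing import FrozenSet, Iterable, List, Optional, Tuple

# A rule is (premises, conclusion): if every premise is known, conclude.
Rule = Tuple[FrozenSet[str], str]


def prove(facts: Iterable[str], rules: List[Rule], goal: str) -> Optional[List[str]]:
    """Return the derivation steps that reach `goal`, or None if it is not derivable.

    Hypothetical toy prover: plain forward chaining over propositional Horn rules.
    """
    known = set(facts)
    steps: List[str] = []
    if goal in known:
        return steps  # the goal is already a known fact; no steps needed
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            # Fire a rule only if it adds a new fact and all premises hold.
            if conclusion not in known and premises <= known:
                known.add(conclusion)
                steps.append(f"{', '.join(sorted(premises))} |- {conclusion}")
                if conclusion == goal:
                    return steps
                changed = True
    return None  # fixed point reached without deriving the goal


if __name__ == "__main__":
    knowledge_base = [(frozenset({"p"}), "q"), (frozenset({"q"}), "r")]
    for step in prove({"p"}, knowledge_base, "r") or []:
        print(step)
```

Here the knowledge base is the fact `p` plus two rules, the conjecture is `r`, and the returned list of steps plays the role of the proof. Real provers work over far richer logics (first-order or higher-order) and search much larger spaces, which is where the learning-based approaches listed below come in.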

Papers

Showing 1–10 of 288 papers

Title	Date	Tasks	Status	Hype
CriticLean: Critic-Guided Reinforcement Learning for Mathematical Formalization	Jul 8, 2025	Active Learning, Automated Theorem Proving	Code Available	1
Prover Agent: An Agent-based Framework for Formal Mathematical Proofs	Jun 24, 2025	AI Agent, Automated Theorem Proving	Unverified	0
Towards Advanced Mathematical Reasoning for LLMs via First-Order Logic Theorem Proving	Jun 20, 2025	Automated Theorem Proving, Diversity	Unverified	0
MATP-BENCH: Can MLLM Be a Good Automated Theorem Prover for Multimodal Problems?	Jun 6, 2025	Automated Theorem Proving, Visual Reasoning	Unverified	0
Safe: Enhancing Mathematical Reasoning in Large Language Models via Retrospective Step-aware Formal Verification	Jun 5, 2025	Automated Theorem Proving, Hallucination	Code Available	1
LeanExplore: A search engine for Lean 4 declarations	Jun 4, 2025	Automated Theorem Proving	Code Available	2
Rewarding the Unlikely: Lifting GRPO Beyond Distribution Sharpening	Jun 3, 2025	Automated Theorem Proving	Unverified	0
Faithful and Robust LLM-Driven Theorem Proving for NLI Explanations	May 30, 2025	Automated Theorem Proving, Natural Language Inference	Unverified	0
ProofNet++: A Neuro-Symbolic System for Formal Proof Verification with Self-Correction	May 30, 2025	Automated Theorem Proving	Unverified	0
DeepTheorem: Advancing LLM Reasoning for Theorem Proving Through Natural Language and Reinforcement Learning	May 29, 2025	Automated Theorem Proving, Mathematical Reasoning	Code Available	1

Datasets: miniF2F-test, miniF2F-valid, HolStep (Conditional), HOList benchmark, HolStep (Unconditional), Metamath set.mm, miniF2F-curriculum, CompCert, CoqGym

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	Kimina-Prover-Preview	cumulative	80.74	—	Unverified
2	ProofAug	cumulative	66	—	Unverified
3	DeepSeek-Prover-V1.5	cumulative	63.5	—	Unverified
4	Subgoal-XL	cumulative	56.1	—	Unverified
5	DeepSeek-Prover	cumulative	52	—	Unverified
6	LEGO-Prover ChatGPT	cumulative	47.1	—	Unverified
7	Lyra + GPT-4	cumulative	47.1	—	Unverified
8	Decomposing the Enigma	cumulative	45.5	—	Unverified
9	Evariste	cumulative	41	—	Unverified
10	Evariste-7d	cumulative	40.6	—	Unverified