Automated Theorem Proving

The goal of Automated Theorem Proving is to automatically generate a proof, given a conjecture (the target theorem) and a knowledge base of known facts, all expressed in a formal language. Automated Theorem Proving is useful in a wide range of applications, including the verification and synthesis of software and hardware systems.

Source: Learning to Prove Theorems by Learning to Generate Theorems
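To make the definition above concrete, here is a minimal sketch of a toy prover. It assumes a much simpler setting than the systems listed on this page: the knowledge base holds propositional facts and Horn rules, and the search is plain forward chaining. The names `Rule` and `prove` are hypothetical and chosen for illustration only; they are not taken from any paper or library listed here.

```python
from typing import Dict, FrozenSet, List, Optional, Set, Tuple

# A Horn rule: if every premise holds, the conclusion holds.
Rule = Tuple[FrozenSet[str], str]


def prove(facts: Set[str], rules: List[Rule], goal: str) -> Optional[List[str]]:
    """Forward-chain from the known facts; return proof steps for `goal`, or None."""
    # Maps each derived atom to the rule that produced it (None means "given").
    derived: Dict[str, Optional[Rule]] = {fact: None for fact in facts}

    changed = True
    while changed and goal not in derived:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in derived and all(p in derived for p in premises):
                derived[conclusion] = (premises, conclusion)
                changed = True

    if goal not in derived:
        return None  # the goal does not follow from this knowledge base

    # Walk back from the goal to list only the steps the proof actually uses.
    steps: List[str] = []
    seen: Set[str] = set()

    def emit(atom: str) -> None:
        if atom in seen:
            return
        seen.add(atom)
        rule = derived[atom]
        if rule is None:
            steps.append(f"{atom} (given)")
        else:
            for premise in sorted(rule[0]):
                emit(premise)
            steps.append(f"{atom} (from {', '.join(sorted(rule[0]))})")

    emit(goal)
    return steps


if __name__ == "__main__":
    knowledge_base = {"human(socrates)"}
    horn_rules = [(frozenset({"human(socrates)"}), "mortal(socrates)")]
    for step in prove(knowledge_base, horn_rules, "mortal(socrates)") or []:
        print(step)
```

The provers in the tables below work in far richer formal languages (for example Lean, Isabelle, Metamath, Coq, or HOL Light) and typically guide the search with learned models, but the input/output contract is the same: a conjecture and known facts go in, and a checkable proof comes out.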

Papers

Showing 26–50 of 288 papers

Title	Date	Tasks	Status	Hype
Ineq-Comp: Benchmarking Human-Intuitive Compositional Reasoning in Automated Theorem Proving on Inequalities	May 19, 2025	Automated Theorem Proving, Benchmarking	Code Available	1
Leanabell-Prover: Posttraining Scaling in Formal Reasoning	Apr 8, 2025	Automated Theorem Proving, reinforcement-learning	Code Available	1
MA-LoT: Multi-Agent Lean-based Long Chain-of-Thought Reasoning enhances Formal Theorem Proving	Mar 5, 2025	Automated Theorem Proving, Transfer Learning	Code Available	1
ProofWala: Multilingual Proof Data Synthesis and Theorem-Proving	Feb 7, 2025	Automated Theorem Proving	Code Available	1
Efficient Neural Theorem Proving via Fine-grained Proof Structure Analysis	Jan 30, 2025	Automated Theorem Proving, Math	Code Available	1
Beyond Autoregression: Fast LLMs via Self-Distillation Through Time	Oct 28, 2024	Automated Theorem Proving, Code Generation	Code Available	1
SubgoalXL: Subgoal-based Expert Learning for Theorem Proving	Aug 20, 2024	Automated Theorem Proving	Code Available	1
TheoremLlama: Transforming General-Purpose LLMs into Lean4 Experts	Jul 3, 2024	Automated Theorem Proving, Code Generation	Code Available	1
Proving Theorems Recursively	May 23, 2024	Automated Theorem Proving	Code Available	1
Don't Trust: Verify -- Grounding LLM Quantitative Reasoning with Autoformalization	Mar 26, 2024	Automated Theorem Proving, GSM8K	Code Available	1
LeanReasoner: Boosting Complex Logical Reasoning with Lean	Mar 20, 2024	Automated Theorem Proving, Logical Reasoning	Code Available	1
MUSTARD: Mastering Uniform Synthesis of Theorem and Proof Data	Feb 14, 2024	Automated Theorem Proving, Language Modelling	Code Available	1
An In-Context Learning Agent for Formal Theorem-Proving	Oct 6, 2023	Automated Theorem Proving, In-Context Learning	Code Available	1
LEGO-Prover: Neural Theorem Proving with Growing Libraries	Oct 1, 2023	Automated Theorem Proving	Code Available	1
Lyra: Orchestrating Dual Correction in Automated Theorem Proving	Sep 27, 2023	Automated Theorem Proving, Hallucination	Code Available	1
FIMO: A Challenge Formal Dataset for Automated Theorem Proving	Sep 8, 2023	Automated Theorem Proving	Code Available	1
Decomposing the Enigma: Subgoal-based Demonstration Learning for Formal Theorem Proving	May 25, 2023	Automated Theorem Proving	Code Available	1
An Ensemble Approach for Automated Theorem Proving Based on Efficient Name Invariant Graph Neural Representations	May 15, 2023	Automated Theorem Proving, Transfer Learning	Code Available	1
ProofNet: Autoformalizing and Formally Proving Undergraduate-Level Mathematics	Feb 24, 2023	Abstract Algebra, Automated Theorem Proving	Code Available	1
Peano: Learning Formal Mathematical Reasoning	Nov 29, 2022	Automated Theorem Proving, Mathematical Reasoning	Code Available	1
Draft, Sketch, and Prove: Guiding Formal Theorem Provers with Informal Proofs	Oct 21, 2022	Automated Theorem Proving, Language Modeling	Code Available	1
NaturalProver: Grounded Mathematical Proof Generation with Language Models	May 25, 2022	Automated Theorem Proving, Language Modeling	Code Available	1
Linear algebra with transformers	Dec 3, 2021	Automated Theorem Proving, Few-Shot Learning	Code Available	1
An energy-based model for neuro-symbolic reasoning on knowledge graphs	Oct 4, 2021	Automated Theorem Proving, Graph Embedding	Code Available	1
Neural Unification for Logic Reasoning over Natural Language	Sep 17, 2021	Automated Theorem Proving, Question Answering	Code Available	1

Datasets: miniF2F-test, miniF2F-valid, HolStep (Conditional), HOList benchmark, HolStep (Unconditional), Metamath set.mm, miniF2F-curriculum, CompCert, CoqGym

Benchmark Results

miniF2F-test

#	Model	Metric	Claimed	Verified	Status
1	Kimina-Prover-Preview	cumulative	80.74	—	Unverified
2	ProofAug	cumulative	66	—	Unverified
3	DeepSeek-Prover-V1.5	cumulative	63.5	—	Unverified
4	Subgoal-XL	cumulative	56.1	—	Unverified
5	DeepSeek-Prover	cumulative	52	—	Unverified
6	Lyra + GPT-4	cumulative	47.1	—	Unverified
7	LEGO-Prover ChatGPT	cumulative	47.1	—	Unverified
8	Decomposing the Enigma	cumulative	45.5	—	Unverified
9	Evariste	cumulative	41	—	Unverified
10	Evariste-7d	cumulative	40.6	—	Unverified

miniF2F-valid

#	Model	Metric	Claimed	Verified	Status
1	Evariste	Pass@64	58.6	—	Unverified
2	LEGO-Prover ChatGPT	Pass@100	57	—	Unverified
3	Lyra + GPT-4	Pass@100	52	—	Unverified
4	Evariste-7d	Pass@64	47.5	—	Unverified
5	GPT-f	Pass@64	47.3	—	Unverified
6	Evariste-1d	Pass@64	46.7	—	Unverified
7	DSP (62B Minerva informal)	Pass@100	43.9	—	Unverified
8	Lean GPT-f	Pass@8	29.3	—	Unverified
9	Lean tidy	Pass@1	16.8	—	Unverified
10	Metamath GPT-f	Pass@8	2	—	Unverified

HolStep (Conditional)

#	Model	Metric	Claimed	Verified	Status
1	MPNN-DagLSTM	Classification Accuracy	0.92	—	Unverified
2	FormulaNet	Classification Accuracy	0.9	—	Unverified
3	FormulaNet-basic	Classification Accuracy	0.89	—	Unverified
4	Siamese 1D CNN-LSTM	Classification Accuracy	0.83	—	Unverified
5	Siamese 1D CNN	Classification Accuracy	0.82	—	Unverified

HOList benchmark

#	Model	Metric	Claimed	Verified	Status
1	4-hop GNN, sub-expression sharing	Percentage correct	49.95	—	Unverified
2	Tactic Dependent Loop	Percentage correct	38.88	—	Unverified
3	BoW2 (extra -ves)	Percentage correct	36.55	—	Unverified
4	Deeper Wider WaveNet	Percentage correct	32.65	—	Unverified

HolStep (Unconditional)

#	Model	Metric	Claimed	Verified	Status
1	FormulaNet	Classification Accuracy	0.9	—	Unverified
2	FormulaNet-basic	Classification Accuracy	0.89	—	Unverified
3	1D CNN	Classification Accuracy	0.83	—	Unverified
4	1D CNN-LSTM	Classification Accuracy	0.83	—	Unverified

Metamath set.mm

#	Model	Metric	Claimed	Verified	Status
1	Evariste	Pass@32	72.4	—	Unverified
2	GPT-f	Percentage correct	56.2	—	Unverified
3	MetaGen-IL + Holophrasm	Percentage correct	22.1	—	Unverified
4	Holophrasm	Percentage correct	14.3	—	Unverified

miniF2F-curriculum

#	Model	Metric	Claimed	Verified	Status
1	Evariste-7d	Pass@64	42.5	—	Unverified
2	Evariste-1d	Pass@64	33.6	—	Unverified
3	Evariste	Pass@64	32.1	—	Unverified
4	GPT-f	Pass@64	30.6	—	Unverified

CompCert

#	Model	Metric	Claimed	Verified	Status
1	Proverbot9001	Percentage correct	19.36	—	Unverified
2	CoqGym/ASTactic	Percentage correct	4.99	—	Unverified

CoqGym

#	Model	Metric	Claimed	Verified	Status
1	ASTactic	Percentage correct	12.2	—	Unverified