Automated Theorem Proving

The goal of Automated Theorem Proving is to automatically generate a proof, given a conjecture (the target theorem) and a knowledge base of known facts, all expressed in a formal language. Automated Theorem Proving is useful in a wide range of applications, including the verification and synthesis of software and hardware systems.

Source: Learning to Prove Theorems by Learning to Generate Theorems
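
As a toy illustration of the task described above (a conjecture, a knowledge base of known facts, and a generated proof), the sketch below forward-chains over propositional Horn rules. It is a minimal example under simplifying assumptions, not the method of any paper listed on this page; the names `prove` and `Rule` and the string encoding of facts are hypothetical choices made for illustration.

```python
from typing import Dict, FrozenSet, List, Optional, Set, Tuple

# Hypothetical encoding: a rule is (premises, conclusion). If every premise
# has been proved, the conclusion may be added to the set of proved atoms.
Rule = Tuple[FrozenSet[str], str]


def prove(facts: Set[str], rules: List[Rule], conjecture: str) -> Optional[List[str]]:
    """Forward-chain over propositional Horn rules.

    Returns the proof steps the conjecture depends on, ending with the
    conjecture itself, or None if it cannot be derived from the knowledge base.
    """
    # Map each proved atom to the rule that produced it (None for given facts).
    derived: Dict[str, Optional[Rule]] = {f: None for f in facts}
    changed = True
    while changed and conjecture not in derived:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in derived and all(p in derived for p in premises):
                derived[conclusion] = (premises, conclusion)
                changed = True
    if conjecture not in derived:
        return None

    # Walk back from the conjecture so the proof lists only relevant steps.
    steps: List[str] = []
    seen: Set[str] = set()

    def explain(atom: str) -> None:
        if atom in seen:
            return
        seen.add(atom)
        rule = derived[atom]
        if rule is None:
            steps.append(f"{atom} (given)")
            return
        premises, _ = rule
        for p in sorted(premises):
            explain(p)
        steps.append(f"{atom} (from {', '.join(sorted(premises))})")

    explain(conjecture)
    return steps


if __name__ == "__main__":
    facts = {"human(socrates)"}
    rules = [(frozenset({"human(socrates)"}), "mortal(socrates)")]
    for step in prove(facts, rules, "mortal(socrates)"):
        print(step)
```

Real provers work in far richer logics (first-order, higher-order, dependent type theory) and must search a much larger space, which is where the learned guidance studied in the papers below comes in.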

Papers

Showing 51–100 of 288 papers

Title	Date	Tasks	Status	Hype	Score
A Deep Reinforcement Learning Approach to First-Order Logic Theorem Proving	Nov 5, 2019	Automated Theorem Proving, Deep Reinforcement Learning	Code Available	1	5
Proof Artifact Co-training for Theorem Proving with Language Models	Feb 11, 2021	Automated Theorem Proving, Imitation Learning	Code Available	1	5
ProofNet: Autoformalizing and Formally Proving Undergraduate-Level Mathematics	Feb 24, 2023	Abstract Algebra, Automated Theorem Proving	Code Available	1	5
Decomposing the Enigma: Subgoal-based Demonstration Learning for Formal Theorem Proving	May 25, 2023	Automated Theorem Proving	Code Available	1	5
AI Descartes: Combining Data and Theory for Derivable Scientific Discovery	Sep 3, 2021	Automated Theorem Proving, BIG-bench Machine Learning	Code Available	1	5
Measuring Systematic Generalization in Neural Proof Generation with Transformers	Sep 30, 2020	Automated Theorem Proving, Logical Reasoning	Code Available	1	5
Neural Unification for Logic Reasoning over Natural Language	Sep 17, 2021	Automated Theorem Proving, Question Answering	Code Available	1	5
An Ensemble Approach for Automated Theorem Proving Based on Efficient Name Invariant Graph Neural Representations	May 15, 2023	Automated Theorem Proving, Transfer Learning	Code Available	1	5
Efficient Neural Theorem Proving via Fine-grained Proof Structure Analysis	Jan 30, 2025	Automated Theorem Proving, Math	Code Available	1	5
SubgoalXL: Subgoal-based Expert Learning for Theorem Proving	Aug 20, 2024	Automated Theorem Proving	Code Available	1	5
Don't Trust: Verify -- Grounding LLM Quantitative Reasoning with Autoformalization	Mar 26, 2024	Automated Theorem Proving, GSM8K	Code Available	1	5
An energy-based model for neuro-symbolic reasoning on knowledge graphs	Oct 4, 2021	Automated Theorem Proving, Graph Embedding	Code Available	1	5
Ineq-Comp: Benchmarking Human-Intuitive Compositional Reasoning in Automated Theorem Proving on Inequalities	May 19, 2025	Automated Theorem Proving, Benchmarking	Code Available	1	5
TheoremLlama: Transforming General-Purpose LLMs into Lean4 Experts	Jul 3, 2024	Automated Theorem Proving, Code Generation	Code Available	1	5
Draft, Sketch, and Prove: Guiding Formal Theorem Provers with Informal Proofs	Oct 21, 2022	Automated Theorem Proving, Language Modeling	Code Available	1	5
Scaling Synthetic Logical Reasoning Datasets with Context-Sensitive Declarative Grammars	Jun 16, 2024	Automated Theorem Proving, Logical Reasoning	Code Available	0	5
Solving Geometry Problems: Combining Text and Diagram Interpretation	Sep 1, 2015	Automated Theorem Proving, Mathematical Question Answering	Code Available	0	5
A Survey on Mathematical Reasoning and Optimization with Large Language Models	Mar 22, 2025	Automated Theorem Proving, Heuristic Search	Code Available	0	5
Deep Reinforcement Learning for Synthesizing Functions in Higher-Order Logic	Oct 25, 2019	Automated Theorem Proving, BIG-bench Machine Learning	Code Available	0	5
DeepMath - Deep Sequence Models for Premise Selection	Jun 14, 2016	Automated Theorem Proving	Code Available	0	5
REFACTOR: Learning to Extract Theorems from Proofs	Feb 26, 2024	Automated Theorem Proving	Code Available	0	5
Solving Quantified Modal Logic Problems by Translation to Classical Logics	Dec 19, 2022	Automated Theorem Proving, Translation	Code Available	0	5
Alchemy: Amplifying Theorem-Proving Capability through Symbolic Mutation	Oct 21, 2024	Automated Theorem Proving, Continual Pretraining	Code Available	0	5
Premise Selection for Theorem Proving by Deep Graph Embedding	Sep 28, 2017	Automated Theorem Proving, General Classification	Code Available	0	5
Neural Theorem Provers Do Not Learn Rules Without Exploration	Jun 17, 2019	Automated Theorem Proving, Diagnostic	Code Available	0	5
On the (In)feasibility of ML Backdoor Detection as an Hypothesis Testing Problem	Feb 26, 2024	Automated Theorem Proving, Out-of-Distribution Detection	Code Available	0	5
OxKBC: Outcome Explanation for Factorization Based Knowledge Base Completion	Feb 14, 2020	Automated Theorem Proving, Knowledge Base Completion	Code Available	0	5
Logically Consistent Adversarial Attacks for Soft Theorem Provers	Apr 29, 2022	Automated Theorem Proving	Code Available	0	5
LLM-based Automated Theorem Proving Hinges on Scalable Synthetic Data Generation	May 17, 2025	Automated Theorem Proving, Synthetic Data Generation	Code Available	0	5
FVEL: Interactive Formal Verification Environment with Large Language Models via Theorem Proving	Jun 20, 2024	Automated Theorem Proving, Program Synthesis	Code Available	0	5
Mathematical Formalized Problem Solving and Theorem Proving in Different Fields in Lean 4	Sep 9, 2024	Abstract Algebra, Automated Theorem Proving	Code Available	0	5
MIRB: Mathematical Information Retrieval Benchmark	May 21, 2025	Automated Theorem Proving, Information Retrieval	Code Available	0	5
Lectures on Jacques Herbrand as a Logician	Feb 26, 2009	Automated Theorem Proving, Formal Logic	Code Available	0	5
Learning to Prove Theorems via Interacting with Proof Assistants	May 21, 2019	Automated Theorem Proving, Mathematical Proofs	Code Available	0	5
Aplib: Tactical Programming of Intelligent Agents	Nov 12, 2019	Automated Theorem Proving	Code Available	0	5
Lemmas: Generation, Selection, Application	Mar 10, 2023	Automated Theorem Proving	Code Available	0	5
Learning Symbolic Rules for Reasoning in Quasi-Natural Language	Nov 23, 2021	Automated Theorem Proving, Formal Logic	Code Available	0	5
Learning Rules Explaining Interactive Theorem Proving Tactic Prediction	Nov 2, 2024	Automated Theorem Proving, Inductive logic programming	Code Available	0	5
G2SAT: Learning to Generate SAT Formulas	Oct 29, 2019	Automated Theorem Proving	Code Available	0	5
GamePad: A Learning Environment for Theorem Proving	Jun 2, 2018	Automated Theorem Proving, Position	Code Available	0	5
Learning to Match Mathematical Statements with Proofs	Feb 3, 2021	Articles, Automated Theorem Proving	Code Available	0	5
Enumerate-Conjecture-Prove: Formally Solving Answer-Construction Problems in Math Competitions	May 24, 2025	Automated Theorem Proving, Math	Code Available	0	5
HolStep: A Machine Learning Dataset for Higher-order Logic Theorem Proving	Mar 1, 2017	Automated Theorem Proving, BIG-bench Machine Learning	Code Available	0	5
Automated proof synthesis for propositional logic with deep neural networks	May 30, 2018	Automated Theorem Proving	Code Available	0	5
HOL(y)Hammer: Online ATP Service for HOL Light	Sep 19, 2013	Automated Theorem Proving, CPU	Code Available	0	5
Holophrasm: a neural Automated Theorem Prover for higher-order logic	Aug 8, 2016	Automated Theorem Proving	Code Available	0	5
Improving Graph Neural Network Representations of Logical Formulae with Subgraph Pooling	Nov 15, 2019	Automated Theorem Proving, Deep Learning	Code Available	0	5
Automated Completion of Statements and Proofs in Synthetic Geometry: an Approach based on Constraint Solving	Jan 22, 2024	Automated Theorem Proving	Code Available	0	5
Hierarchical Attention Generates Better Proofs	Apr 27, 2025	Automated Theorem Proving, Mathematical Proofs	Code Available	0	5
Guiding Inferences in Connection Tableau by Recurrent Neural Networks	May 20, 2019	Automated Theorem Proving, Machine Translation	Code Available	0	5

Datasets: miniF2F-test, miniF2F-valid, HolStep (Conditional), HOList benchmark, HolStep (Unconditional), Metamath set.mm, miniF2F-curriculum, CompCert, CoqGym

Benchmark Results

Dataset: miniF2F-test
#	Model	Metric	Claimed	Verified	Status
1	Kimina-Prover-Preview	cumulative	80.74	—	Unverified
2	ProofAug	cumulative	66	—	Unverified
3	DeepSeek-Prover-V1.5	cumulative	63.5	—	Unverified
4	Subgoal-XL	cumulative	56.1	—	Unverified
5	DeepSeek-Prover	cumulative	52	—	Unverified
6	Lyra + GPT-4	cumulative	47.1	—	Unverified
7	LEGO-Prover ChatGPT	cumulative	47.1	—	Unverified
8	Decomposing the Enigma	cumulative	45.5	—	Unverified
9	Evariste	cumulative	41	—	Unverified
10	Evariste-7d	cumulative	40.6	—	Unverified

Dataset: miniF2F-valid
#	Model	Metric	Claimed	Verified	Status
1	Evariste	Pass@64	58.6	—	Unverified
2	LEGO-Prover ChatGPT	Pass@100	57	—	Unverified
3	Lyra + GPT-4	Pass@100	52	—	Unverified
4	Evariste-7d	Pass@64	47.5	—	Unverified
5	GPT-f	Pass@64	47.3	—	Unverified
6	Evariste-1d	Pass@64	46.7	—	Unverified
7	DSP (62B Minerva informal)	Pass@100	43.9	—	Unverified
8	Lean GPT-f	Pass@8	29.3	—	Unverified
9	Lean tidy	Pass@1	16.8	—	Unverified
10	Metamath GPT-f	Pass@8	2	—	Unverified

Dataset: HolStep (Conditional)
#	Model	Metric	Claimed	Verified	Status
1	MPNN-DagLSTM	Classification Accuracy	0.92	—	Unverified
2	FormulaNet	Classification Accuracy	0.9	—	Unverified
3	FormulaNet-basic	Classification Accuracy	0.89	—	Unverified
4	Siamese 1D CNN-LSTM	Classification Accuracy	0.83	—	Unverified
5	Siamese 1D CNN	Classification Accuracy	0.82	—	Unverified

Dataset: HOList benchmark
#	Model	Metric	Claimed	Verified	Status
1	4-hop GNN, sub-expression sharing	Percentage correct	49.95	—	Unverified
2	Tactic Dependent Loop	Percentage correct	38.88	—	Unverified
3	BoW2 (extra -ves)	Percentage correct	36.55	—	Unverified
4	Deeper Wider WaveNet	Percentage correct	32.65	—	Unverified

Dataset: HolStep (Unconditional)
#	Model	Metric	Claimed	Verified	Status
1	FormulaNet	Classification Accuracy	0.9	—	Unverified
2	FormulaNet-basic	Classification Accuracy	0.89	—	Unverified
3	1D CNN	Classification Accuracy	0.83	—	Unverified
4	1D CNN-LSTM	Classification Accuracy	0.83	—	Unverified

Dataset: Metamath set.mm
#	Model	Metric	Claimed	Verified	Status
1	Evariste	Pass@32	72.4	—	Unverified
2	GPT-f	Percentage correct	56.2	—	Unverified
3	MetaGen-IL + Holophrasm	Percentage correct	22.1	—	Unverified
4	Holophrasm	Percentage correct	14.3	—	Unverified

Dataset: miniF2F-curriculum
#	Model	Metric	Claimed	Verified	Status
1	Evariste-7d	Pass@64	42.5	—	Unverified
2	Evariste-1d	Pass@64	33.6	—	Unverified
3	Evariste	Pass@64	32.1	—	Unverified
4	GPT-f	Pass@64	30.6	—	Unverified

Dataset: CompCert
#	Model	Metric	Claimed	Verified	Status
1	Proverbot9001	Percentage correct	19.36	—	Unverified
2	CoqGym/ASTactic	Percentage correct	4.99	—	Unverified

Dataset: CoqGym
#	Model	Metric	Claimed	Verified	Status
1	ASTactic	Percentage correct	12.2	—	Unverified