Automated Theorem Proving

The goal of Automated Theorem Proving is to automatically generate a proof, given a conjecture (the target theorem) and a knowledge base of known facts, all expressed in a formal language. Automated Theorem Proving is useful in a wide range of applications, including the verification and synthesis of software and hardware systems.

Source: Learning to Prove Theorems by Learning to Generate Theorems
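
The definition above can be illustrated with a toy prover. This is a minimal sketch, assuming a propositional knowledge base of Horn rules and using forward chaining. The function `prove`, the facts, and the rules are hypothetical names chosen for illustration and are not taken from any paper or system listed here; real provers target far richer logics.

```python
from typing import Optional

# A Horn rule: if every premise is known, the conclusion may be derived.
Rule = tuple[frozenset[str], str]


def prove(facts: set[str], rules: list[Rule], goal: str) -> Optional[list[str]]:
    """Forward-chain from the known facts.

    Returns the derivation steps taken if the goal is reached, else None.
    The trace may include steps that are not strictly needed for the goal.
    """
    known = set(facts)
    steps: list[str] = []
    changed = True
    while changed and goal not in known:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and premises <= known:
                known.add(conclusion)
                steps.append(f"{', '.join(sorted(premises))} |- {conclusion}")
                changed = True
    return steps if goal in known else None


# Hypothetical knowledge base and conjecture, for illustration only.
facts = {"p"}
rules = [(frozenset({"p"}), "q"), (frozenset({"p", "q"}), "r")]
proof = prove(facts, rules, "r")
print(proof)
```

Here the conjecture is `r`, the knowledge base is the fact `p` plus two rules, and the returned list plays the role of the generated proof. A conjecture that cannot be derived, such as `s`, yields `None`.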

Papers


Showing 176–200 of 288 papers

Title	Date	Tasks	Status
Verifier Theory and Unverifiability	Sep 1, 2016	Automated Theorem Proving, General Classification	Unverified
Vulnerability Detection: From Formal Verification to Large Language Models and Hybrid Approaches: A Comprehensive Overview	Mar 13, 2025	Automated Theorem Proving, software testing	Unverified
0-1 laws for pattern occurrences in phylogenetic trees and networks	Feb 7, 2024	10-shot image generation	Unverified
Wu's Method can Boost Symbolic AI to Rival Silver Medalists and AlphaGeometry to Outperform Gold Medalists at IMO Geometry	Apr 9, 2024	Automated Theorem Proving, CPU	Unverified
MPS-Prover: Advancing Stepwise Theorem Proving by Multi-Perspective Search and Data Curation	May 16, 2025	Automated Theorem Proving	Unverified
3D-Prover: Diversity Driven Theorem Proving With Determinantal Point Processes	Oct 14, 2024	Automated Theorem Proving, Diversity	Unverified
A Categorization of Complexity Classes for Information Retrieval and Synthesis Using Natural Logic	Feb 28, 2024	Automated Theorem Proving, Information Retrieval	Unverified
A Certified Proof Checker for Deep Neural Network Verification in Imandra	May 17, 2024	Automated Theorem Proving, LEMMA	Unverified
A Combinatorial Identities Benchmark for Theorem Proving via Automated Theorem Generation	Feb 25, 2025	Automated Theorem Proving, Language Modeling	Unverified
Activation Steering in Neural Theorem Provers	Feb 21, 2025	Automated Theorem Proving	Unverified
A Curious New Result of Resolution Strategies in Negation-Limited Inverters Problem	Nov 2, 2020	Automated Theorem Proving, Negation	Unverified
Adversarial Learning to Reason in an Arbitrary Logic	Apr 6, 2022	Automated Theorem Proving	Unverified
Analysis of Algorithms and Partial Algorithms	Jan 13, 2016	Automated Theorem Proving	Unverified
An Experimental Study of Formula Embeddings for Automated Theorem Proving in First-Order Logic	Feb 2, 2020	Automated Theorem Proving	Unverified
Anti-unification and Generalization: A Survey	Feb 1, 2023	Automated Theorem Proving, Survey	Unverified
APE-Bench I: Towards File-level Automated Proof Engineering of Formal Math Libraries	Apr 27, 2025	Automated Theorem Proving, Bug fixing	Unverified
APOLLO: Automated LLM and Lean Collaboration for Advanced Formal Reasoning	May 9, 2025	Automated Theorem Proving	Unverified
Applying Second-Order Quantifier Elimination in Inspecting Gödel's Ontological Proof	Oct 21, 2021	Automated Theorem Proving	Unverified
Artifical intelligence and inherent mathematical difficulty	Aug 1, 2024	Automated Theorem Proving	Unverified
Artificial Neural Networks that Learn to Satisfy Logic Constraints	Dec 8, 2017	Automated Theorem Proving	Unverified
A state vector algebra for algorithmic implementation of second-order logic	Dec 9, 2013	Automated Theorem Proving	Unverified
A Study of Continuous Vector Representations for Theorem Proving	Jan 22, 2021	Automated Theorem Proving	Unverified
ATG: Benchmarking Automated Theorem Generation for Generative Language Models	May 5, 2024	Automated Theorem Proving, Benchmarking	Unverified
Autoformalization with Large Language Models	May 25, 2022	Automated Theorem Proving, Program Synthesis	Unverified
Automated Planning Techniques for Elementary Proofs in Abstract Algebra	Dec 11, 2023	Abstract Algebra, Automated Theorem Proving	Unverified



Datasets: miniF2F-test, miniF2F-valid, HolStep (Conditional), HOList benchmark, HolStep (Unconditional), Metamath set.mm, miniF2F-curriculum, CompCert, CoqGym

Benchmark Results

miniF2F-test
#	Model	Metric	Claimed	Verified	Status
1	Kimina-Prover-Preview	cumulative	80.74	—	Unverified
2	ProofAug	cumulative	66	—	Unverified
3	DeepSeek-Prover-V1.5	cumulative	63.5	—	Unverified
4	Subgoal-XL	cumulative	56.1	—	Unverified
5	DeepSeek-Prover	cumulative	52	—	Unverified
6	Lyra + GPT-4	cumulative	47.1	—	Unverified
7	LEGO-Prover ChatGPT	cumulative	47.1	—	Unverified
8	Decomposing the Enigma	cumulative	45.5	—	Unverified
9	Evariste	cumulative	41	—	Unverified
10	Evariste-7d	cumulative	40.6	—	Unverified

miniF2F-valid
#	Model	Metric	Claimed	Verified	Status
1	Evariste	Pass@64	58.6	—	Unverified
2	LEGO-Prover ChatGPT	Pass@100	57	—	Unverified
3	Lyra + GPT-4	Pass@100	52	—	Unverified
4	Evariste-7d	Pass@64	47.5	—	Unverified
5	GPT-f	Pass@64	47.3	—	Unverified
6	Evariste-1d	Pass@64	46.7	—	Unverified
7	DSP (62B Minerva informal)	Pass@100	43.9	—	Unverified
8	Lean GPT-f	Pass@8	29.3	—	Unverified
9	Lean tidy	Pass@1	16.8	—	Unverified
10	Metamath GPT-f	Pass@8	2	—	Unverified

HolStep (Conditional)
#	Model	Metric	Claimed	Verified	Status
1	MPNN-DagLSTM	Classification Accuracy	0.92	—	Unverified
2	FormulaNet	Classification Accuracy	0.9	—	Unverified
3	FormulaNet-basic	Classification Accuracy	0.89	—	Unverified
4	Siamese 1D CNN-LSTM	Classification Accuracy	0.83	—	Unverified
5	Siamese 1D CNN	Classification Accuracy	0.82	—	Unverified

HOList benchmark
#	Model	Metric	Claimed	Verified	Status
1	4-hop GNN, sub-expression sharing	Percentage correct	49.95	—	Unverified
2	Tactic Dependent Loop	Percentage correct	38.88	—	Unverified
3	BoW2 (extra -ves)	Percentage correct	36.55	—	Unverified
4	Deeper Wider WaveNet	Percentage correct	32.65	—	Unverified

HolStep (Unconditional)
#	Model	Metric	Claimed	Verified	Status
1	FormulaNet	Classification Accuracy	0.9	—	Unverified
2	FormulaNet-basic	Classification Accuracy	0.89	—	Unverified
3	1D CNN	Classification Accuracy	0.83	—	Unverified
4	1D CNN-LSTM	Classification Accuracy	0.83	—	Unverified

Metamath set.mm
#	Model	Metric	Claimed	Verified	Status
1	Evariste	Pass@32	72.4	—	Unverified
2	GPT-f	Percentage correct	56.2	—	Unverified
3	MetaGen-IL + Holophrasm	Percentage correct	22.1	—	Unverified
4	Holophrasm	Percentage correct	14.3	—	Unverified

miniF2F-curriculum
#	Model	Metric	Claimed	Verified	Status
1	Evariste-7d	Pass@64	42.5	—	Unverified
2	Evariste-1d	Pass@64	33.6	—	Unverified
3	Evariste	Pass@64	32.1	—	Unverified
4	GPT-f	Pass@64	30.6	—	Unverified

CompCert
#	Model	Metric	Claimed	Verified	Status
1	Proverbot9001	Percentage correct	19.36	—	Unverified
2	CoqGym/ASTactic	Percentage correct	4.99	—	Unverified

CoqGym
#	Model	Metric	Claimed	Verified	Status
1	ASTactic	Percentage correct	12.2	—	Unverified