SOTAVerified

Automated Theorem Proving

The goal of Automated Theorem Proving is to automatically generate a proof, given a conjecture (the target theorem) and a knowledge base of known facts, all expressed in a formal language. Automated Theorem Proving is useful in a wide range of applications, including the verification and synthesis of software and hardware systems.

Source: Learning to Prove Theorems by Learning to Generate Theorems

Papers

Showing 201250 of 288 papers

TitleStatusHype
Logical Neural NetworksCode1
Learning to Prove from Synthetic Theorems0
Towards United Reasoning for Automatic Induction in Isabelle/HOL0
Logical Inferences with Comparatives and Generalized QuantifiersCode1
Towards Concise, Machine-discovered Proofs of Gödel's Two Incompleteness Theorems0
Simple Dataset for Proof Method Recommendation in Isabelle/HOL (Dataset Description)0
Prolog Technology Reinforcement Learning ProverCode1
Towards a Geometry Automated Provers Competition0
Proceedings 8th International Workshop on Theorem Proving Components for Educational Software0
Learning to Prove Theorems by Learning to Generate TheoremsCode1
OxKBC: Outcome Explanation for Factorization Based Knowledge Base CompletionCode0
An Experimental Study of Formula Embeddings for Automated Theorem Proving in First-Order Logic0
On Quantified Modal Theorem Proving for Modeling Ethics0
Property Invariant Embedding for Automated Reasoning0
Improving Graph Neural Network Representations of Logical Formulae with Subgraph PoolingCode0
Aplib: Tactical Programming of Intelligent AgentsCode0
A Deep Reinforcement Learning Approach to First-Order Logic Theorem ProvingCode1
G2SAT: Learning to Generate SAT FormulasCode0
Deep Reinforcement Learning for Synthesizing Functions in Higher-Order LogicCode0
CSPLib: Twenty Years On0
Proceedings 35th International Conference on Logic Programming (Technical Communications)0
Designing Game of Theorems0
Neural Theorem Provers Do Not Learn Rules Without ExplorationCode0
Multimodal Logical Inference System for Visual-Textual Entailment0
Towards Finding Longer ProofsCode0
NIL: Learning Nonlinear Interpolants0
Learning to Reason in Large Theories without Imitation0
Graph Representations for Higher-Order Logic and Theorem Proving0
Learning to Prove Theorems via Interacting with Proof AssistantsCode0
Guiding Inferences in Connection Tableau by Recurrent Neural NetworksCode0
Scalable Neural Theorem Proving on Knowledge Bases and Natural Language0
Towards Evolutionary Theorem Proving for Isabelle/HOL0
HOList: An Environment for Machine Learning of Higher-Order Theorem ProvingCode0
Hammering Mizar by Learning Clause Guidance0
Proceedings 7th International Workshop on Theorem proving components for Educational software0
ENIGMA-NG: Efficient Neural and Gradient-Boosted Inference Guidance for E0
Review on DNA Strand Algebra and its Application0
Towards Machine Learning Induction0
Automated Theorem Proving in Intuitionistic Propositional Logic by Deep Reinforcement Learning0
Learning to Reason0
Logical Rule Induction and Theory Learning Using Neural Theorem Proving0
Towards Neural Theorem Proving at Scale0
ML + FV = ? A Survey on the Application of Machine Learning to Formal Verification0
GamePad: A Learning Environment for Theorem ProvingCode0
Automated proof synthesis for propositional logic with deep neural networksCode0
Reinforcement Learning of Theorem Proving0
Cross-checking WordNet and SUMO Using Meronymy0
Consistent CCG Parsing over Multiple Sentences for Improved Logical Reasoning0
Improving QED-Tutrix by Automating the Generation of Proofs0
Proceedings 6th International Workshop on Theorem proving components for Educational software0
Show:102550
← PrevPage 5 of 6Next →

Benchmark Results

#ModelMetricClaimedVerifiedStatus
1Kimina-Prover-Previewcumulative80.74Unverified
2ProofAugcumulative66Unverified
3DeepSeek-Prover-V1.5cumulative63.5Unverified
4Subgoal-XLcumulative56.1Unverified
5DeepSeek-Provercumulative52Unverified
6Lyra + GPT-4cumulative47.1Unverified
7LEGO-Prover ChatGPTcumulative47.1Unverified
8Decomposing the Enigmacumulative45.5Unverified
9Evaristecumulative41Unverified
10Evariste-7dcumulative40.6Unverified
#ModelMetricClaimedVerifiedStatus
1EvaristePass@6458.6Unverified
2LEGO-Prover ChatGPTPass@10057Unverified
3Lyra + GPT-4Pass@10052Unverified
4Evariste-7dPass@6447.5Unverified
5GPT-fPass@6447.3Unverified
6Evariste-1dPass@6446.7Unverified
7DSP (62B Minerva informal)Pass@10043.9Unverified
8Lean GPT-fPass@829.3Unverified
9Lean tidyPass@116.8Unverified
10Metamath GPT-fPass@82Unverified
#ModelMetricClaimedVerifiedStatus
1MPNN-DagLSTMClassification Accuracy0.92Unverified
2FormulaNetClassification Accuracy0.9Unverified
3FormulaNet-basicClassification Accuracy0.89Unverified
4Siamese 1D CNN-LSTMClassification Accuracy0.83Unverified
5Siamese 1D CNNClassification Accuracy0.82Unverified
#ModelMetricClaimedVerifiedStatus
14-hop GNN, sub-expression sharingPercentage correct49.95Unverified
2Tactic Dependent LoopPercentage correct38.88Unverified
3BoW2 (extra -ves)Percentage correct36.55Unverified
4Deeper Wider WaveNetPercentage correct32.65Unverified
#ModelMetricClaimedVerifiedStatus
1FormulaNetClassification Accuracy0.9Unverified
2FormulaNet-basicClassification Accuracy0.89Unverified
31D CNN-LSTMClassification Accuracy0.83Unverified
41D CNNClassification Accuracy0.83Unverified
#ModelMetricClaimedVerifiedStatus
1EvaristePass@3272.4Unverified
2GPT-fPercentage correct56.2Unverified
3MetaGen-IL + HolophrasmPercentage correct22.1Unverified
4HolophrasmPercentage correct14.3Unverified
#ModelMetricClaimedVerifiedStatus
1Evariste-7dPass@6442.5Unverified
2Evariste-1dPass@6433.6Unverified
3EvaristePass@6432.1Unverified
4GPT-fPass@6430.6Unverified
#ModelMetricClaimedVerifiedStatus
1Proverbot9001Percentage correct19.36Unverified
2CoqGym/ASTacticPercentage correct4.99Unverified
#ModelMetricClaimedVerifiedStatus
1ASTacticPercentage correct12.2Unverified