SOTAVerified

Automated Theorem Proving

The goal of Automated Theorem Proving is to automatically generate a proof, given a conjecture (the target theorem) and a knowledge base of known facts, all expressed in a formal language. Automated Theorem Proving is useful in a wide range of applications, including the verification and synthesis of software and hardware systems.

Source: Learning to Prove Theorems by Learning to Generate Theorems

Papers

Showing 151200 of 288 papers

TitleStatusHype
Autoformalization with Large Language Models0
HyperTree Proof Search for Neural Theorem Proving0
From Width-Based Model Checking to Width-Based Automated Theorem Proving0
Thor: Wielding Hammers to Integrate Language Models and Automated Theorem Provers0
The Isabelle ENIGMACode0
Logically Consistent Adversarial Attacks for Soft Theorem ProversCode0
Adversarial Learning to Reason in an Arbitrary Logic0
The Proof is in the Pudding: Using Automated Theorem Proving to Generate Cooking RecipesCode0
Automated Reasoning in Non-classical Logics in the TPTP World0
Selection Strategies for Commonsense Knowledge0
From the String Landscape to the Mathematical Landscape: a Machine-Learning Outlook0
Vehicle: Interfacing Neural Network Verifiers with Interactive Theorem Provers0
Formal Mathematics Statement Curriculum LearningCode2
Proceedings 10th International Workshop on Theorem Proving Components for Educational Software0
Proceedings of the 13th International Conference on Automated Deduction in Geometry0
Proving Theorems using Incremental Learning and Hindsight Experience Replay0
Linear algebra with transformersCode1
Learning Symbolic Rules for Reasoning in Quasi-Natural LanguageCode0
Logically Sound Arguments for the Effectiveness of ML Safety Measures0
Applying Second-Order Quantifier Elimination in Inspecting Gödel's Ontological Proof0
Proof Extraction for Logical Neural Networks0
An energy-based model for neuro-symbolic reasoning on knowledge graphsCode1
Linear algebra with transformers0
Neural Unification for Logic Reasoning over Natural LanguageCode1
Proceedings 37th International Conference on Logic Programming (Technical Communications)0
Conjectures, Tests and Proofs: An Overview of Theory Exploration0
AI Descartes: Combining Data and Theory for Derivable Scientific DiscoveryCode1
MiniF2F: a cross-system benchmark for formal Olympiad-level mathematicsCode1
The Horn Non-Clausal Class and its Polynomiality0
ProoFVer: Natural Logic Theorem Proving for Fact VerificationCode1
Graph Contrastive Pre-training for Effective Theorem Reasoning0
Learning Theorem Proving ComponentsCode1
Learning to Guide a Saturation-Based Theorem Prover0
The Role of Entropy in Guiding a Connection Prover0
NaturalProofs: Mathematical Theorem Proving in Natural LanguageCode1
Training a First-Order Theorem Prover from Synthetic Data0
TacticZero: Learning to Prove Theorems from Scratch with Deep Reinforcement Learning0
Proof Artifact Co-training for Theorem Proving with Language ModelsCode1
Learning Equational Theorem Proving0
Learning to Match Mathematical Statements with ProofsCode0
A Study of Continuous Vector Representationsfor Theorem Proving0
A Curious New Result of Resolution Strategies in Negation-Limited Inverters Problem0
Learning as Abduction: Trainable Natural Logic Theorem Prover for Natural Language InferenceCode1
Proceedings 9th International Workshop on Theorem Proving Components for Educational Software0
Measuring Systematic Generalization in Neural Proof Generation with TransformersCode1
Deriving Theorems in Implicational Linear Logic, Declaratively0
Proceedings 36th International Conference on Logic Programming (Technical Communications)0
Generative Language Modeling for Automated Theorem Proving0
INT: An Inequality Benchmark for Evaluating Generalization in Theorem ProvingCode1
Modelling Value-oriented Legal Reasoning in LogiKEy0
Show:102550
← PrevPage 4 of 6Next →

Benchmark Results

#ModelMetricClaimedVerifiedStatus
1Kimina-Prover-Previewcumulative80.74Unverified
2ProofAugcumulative66Unverified
3DeepSeek-Prover-V1.5cumulative63.5Unverified
4Subgoal-XLcumulative56.1Unverified
5DeepSeek-Provercumulative52Unverified
6Lyra + GPT-4cumulative47.1Unverified
7LEGO-Prover ChatGPTcumulative47.1Unverified
8Decomposing the Enigmacumulative45.5Unverified
9Evaristecumulative41Unverified
10Evariste-7dcumulative40.6Unverified
#ModelMetricClaimedVerifiedStatus
1EvaristePass@6458.6Unverified
2LEGO-Prover ChatGPTPass@10057Unverified
3Lyra + GPT-4Pass@10052Unverified
4Evariste-7dPass@6447.5Unverified
5GPT-fPass@6447.3Unverified
6Evariste-1dPass@6446.7Unverified
7DSP (62B Minerva informal)Pass@10043.9Unverified
8Lean GPT-fPass@829.3Unverified
9Lean tidyPass@116.8Unverified
10Metamath GPT-fPass@82Unverified
#ModelMetricClaimedVerifiedStatus
1MPNN-DagLSTMClassification Accuracy0.92Unverified
2FormulaNetClassification Accuracy0.9Unverified
3FormulaNet-basicClassification Accuracy0.89Unverified
4Siamese 1D CNN-LSTMClassification Accuracy0.83Unverified
5Siamese 1D CNNClassification Accuracy0.82Unverified
#ModelMetricClaimedVerifiedStatus
14-hop GNN, sub-expression sharingPercentage correct49.95Unverified
2Tactic Dependent LoopPercentage correct38.88Unverified
3BoW2 (extra -ves)Percentage correct36.55Unverified
4Deeper Wider WaveNetPercentage correct32.65Unverified
#ModelMetricClaimedVerifiedStatus
1FormulaNetClassification Accuracy0.9Unverified
2FormulaNet-basicClassification Accuracy0.89Unverified
31D CNNClassification Accuracy0.83Unverified
41D CNN-LSTMClassification Accuracy0.83Unverified
#ModelMetricClaimedVerifiedStatus
1EvaristePass@3272.4Unverified
2GPT-fPercentage correct56.2Unverified
3MetaGen-IL + HolophrasmPercentage correct22.1Unverified
4HolophrasmPercentage correct14.3Unverified
#ModelMetricClaimedVerifiedStatus
1Evariste-7dPass@6442.5Unverified
2Evariste-1dPass@6433.6Unverified
3EvaristePass@6432.1Unverified
4GPT-fPass@6430.6Unverified
#ModelMetricClaimedVerifiedStatus
1Proverbot9001Percentage correct19.36Unverified
2CoqGym/ASTacticPercentage correct4.99Unverified
#ModelMetricClaimedVerifiedStatus
1ASTacticPercentage correct12.2Unverified