SOTAVerified

Logical Reasoning

Papers

Showing 551-600 of 747 papers

| Title | Status | Hype |
| --- | --- | --- |
| Not wacky vs. definitely wacky: A study of scalar adverbs in pretrained language models | | 0 |
| Unlocking Temporal Question Answering for Large Language Models with Tailor-Made Reasoning Logic | Code | 0 |
| Deduction under Perturbed Evidence: Probing Student Simulation Capabilities of Large Language Models | | 0 |
| Exploring Self-supervised Logic-enhanced Training for Large Language Models | Code | 0 |
| Query Structure Modeling for Inductive Logical Reasoning Over Knowledge Graphs | Code | 0 |
| Memory-Efficient Fine-Tuning of Compressed Large Language Models via sub-4-bit Integer Quantization | | 0 |
| Teaching Probabilistic Logical Reasoning to Transformers | Code | 0 |
| Atomic Inference for NLI with Generated Facts as Atoms | Code | 0 |
| Hint of Thought prompting: an explainable and zero-shot approach to reasoning tasks with LLMs | | 0 |
| A Simple Generative Model of Logical Reasoning and Statistical Learning | | 0 |
| Knowledge Authoring for Rules and Actions | | 0 |
| Scalable Coupling of Deep Learning with Logical Reasoning | Code | 0 |
| Tackling Universal Properties of Minimal Trap Spaces of Boolean Networks | Code | 0 |
| A Neural Divide-and-Conquer Reasoning Framework for Image Retrieval from Linguistically Complex Text | Code | 0 |
| The Dark Side of Explanations: Poisoning Recommender Systems with Counterfactual Examples | | 0 |
| Sequential Recommendation with Probabilistic Logical Reasoning | Code | 0 |
| ChatABL: Abductive Learning via Natural Language Interaction with ChatGPT | | 0 |
| LeafAI: query generator for clinical cohort discovery rivaling a human programmer | | 0 |
| Scallop: A Language for Neurosymbolic Programming | | 0 |
| Deep Manifold Learning for Reading Comprehension and Logical Reasoning Tasks with Polytuplet Loss | Code | 0 |
| BloombergGPT: A Large Language Model for Finance | | 0 |
| Logical Reasoning over Natural Language as Knowledge Representation: A Survey | Code | 0 |
| Weakly Supervised Knowledge Transfer with Probabilistic Logical Reasoning for Object Detection | Code | 0 |
| Attribution-Scores and Causal Counterfactuals as Explanations in Artificial Intelligence | | 0 |
| A Comprehensive Survey on Pretrained Foundation Models: A History from BERT to ChatGPT | | 0 |
| Double Equivariance for Inductive Link Prediction for Both New Nodes and New Relation Types | Code | 0 |
| Unifying Structure Reasoning and Language Model Pre-training for Complex Reasoning | | 0 |
| A separation logic for sequences in pointer programs and its decidability | | 0 |
| CogReact: A Reinforced Framework to Model Human Cognitive Reaction Modulated by Dynamic Intervention | | 0 |
| LAMBADA: Backward Chaining for Automated Reasoning in Natural Language | | 0 |
| APOLLO: A Simple Approach for Adaptive Pretraining of Language Models for Logical Reasoning | | 0 |
| Towards High-Order Complementary Recommendation via Logical Reasoning Network | Code | 0 |
| Weisfeiler and Leman Go Relational | Code | 0 |
| Neuro-Symbolic Spatio-Temporal Reasoning | | 0 |
| Logical Tasks for Measuring Extrapolation and Rule Comprehension | Code | 0 |
| Evident: a Development Methodology and a Knowledge Base Topology for Data Mining, Machine Learning and General Knowledge Management | | 0 |
| Zero-Shot Classification by Logical Reasoning on Natural Language Explanations | Code | 0 |
| GammaE: Gamma Embeddings for Logical Queries on Knowledge Graphs | Code | 0 |
| TAPE: Assessing Few-shot Russian Language Understanding | Code | 0 |
| MetaLogic: Logical Reasoning Explanations with Fine-Grained Structure | Code | 0 |
| Investigating the Robustness of Natural Language Generation from Logical Forms via Counterfactual Samples | Code | 0 |
| Inductive Logical Query Answering in Knowledge Graphs | Code | 0 |
| Join-Chain Network: A Logical Reasoning View of the Multi-head Attention in Transformer | | 0 |
| To What Extent Do Natural Language Understanding Datasets Correlate to Logical Reasoning? A Method for Diagnosing Logical Reasoning. | | 0 |
| Document-level Biomedical Relation Extraction Based on Multi-Dimensional Fusion Information and Multi-Granularity Logical Reasoning | Code | 0 |
| Type-dependent Prompt CycleQAG : Cycle Consistency for Multi-hop Question Generation | | 0 |
| Towards Human-Compatible XAI: Explaining Data Differentials with Concept Induction over Background Knowledge | | 0 |
| Time-aware Self-Attention Meets Logic Reasoning in Recommender Systems | | 0 |
| Knowledge-based and Data-driven Reasoning and Learning for Ad Hoc Teamwork | | 0 |
| A Scalable, Interpretable, Verifiable & Differentiable Logic Gate Convolutional Neural Network Architecture From Truth Tables | | 0 |

Page 12 of 15

Benchmark Results

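| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | Claude Opus | Delta_NoContext | 28.8 | | Unverified |
| 2 | GPT-4o | Delta_NoContext | 25.1 | | Unverified |
| 3 | Gemini 1.5 Pro | Delta_NoContext | 23.4 | | Unverified |
| 4 | GPT-4 | Delta_NoContext | 21.5 | | Unverified |
| 5 | Command R+ | Delta_NoContext | 11.6 | | Unverified |
| 6 | GPT-3.5 | Delta_NoContext | 11.2 | | Unverified |
| 7 | Mixtral 8x7B | Delta_NoContext | 6.4 | | Unverified |
| 8 | Llama 3 8B | Delta_NoContext | 4.9 | | Unverified |
| 9 | Llama 3 70B | Delta_NoContext | 2.9 | | Unverified |
| 10 | Gemma 7B | Delta_NoContext | 2.2 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | PaLM 2 (few-shot, k=3, Direct) | Accuracy | 64.8 | | Unverified |
| 2 | PaLM 2 (few-shot, k=3, CoT) | Accuracy | 57.2 | | Unverified |
| 3 | OPT 66B (few-shot, k=3) | Accuracy | 54 | | Unverified |
| 4 | PaLM 540B (few-shot, k=3) | Accuracy | 53.6 | | Unverified |
| 5 | GPT-NeoX 20B (few-shot, k=3) | Accuracy | 52.8 | | Unverified |
| 6 | BLOOM 176B (few-shot, k=3) | Accuracy | 52.8 | | Unverified |
| 7 | Chinchilla-70B (few-shot, k=5) | Accuracy | 52.1 | | Unverified |
| 8 | Bloomberg GPT 50B (few-shot, k=3) | Accuracy | 50.8 | | Unverified |
| 9 | Gopher-280B (few-shot, k=5) | Accuracy | 50.7 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | PaLM 2 (few-shot, k=3, CoT) | Accuracy | 84.9 | | Unverified |
| 2 | PaLM 2 (few-shot, k=3, Direct) | Accuracy | 65.8 | | Unverified |
| 3 | Chinchilla-70B (few-shot, k=5) | Accuracy | 48.7 | | Unverified |
| 4 | PaLM 540B (few-shot, k=3) | Accuracy | 44.5 | | Unverified |
| 5 | Gopher-280B (few-shot, k=5) | Accuracy | 40.6 | | Unverified |
| 6 | BLOOM 176B (few-shot, k=3) | Accuracy | 40.41 | | Unverified |
| 7 | Bloomberg GPT (few-shot, k=3) | Accuracy | 37.67 | | Unverified |
| 8 | GPT-NeoX (few-shot, k=3) | Accuracy | 33.56 | | Unverified |
| 9 | OPT 66B (few-shot, k=3) | Accuracy | 28.08 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | PaLM 2 (few-shot, k=3, CoT) | Accuracy | 91.2 | | Unverified |
| 2 | PaLM 2 (few-shot, k=3, Direct) | Accuracy | 61.2 | | Unverified |
| 3 | Chinchilla-70B (few-shot, k=5) | Accuracy | 59.7 | | Unverified |
| 4 | Gopher-280B (few-shot, k=5) | Accuracy | 49.2 | | Unverified |
| 5 | PaLM 540B (few-shot, k=3) | Accuracy | 38 | | Unverified |
| 6 | BLOOM 176B (few-shot, k=3) | Accuracy | 36.8 | | Unverified |
| 7 | Bloomberg GPT (few-shot, k=3) | Accuracy | 34.8 | | Unverified |
| 8 | OPT 66B (few-shot, k=3) | Accuracy | 31.2 | | Unverified |
| 9 | GPT-NeoX (few-shot, k=3) | Accuracy | 26 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | PaLM 2 (few-shot, k=3, CoT) | Accuracy | 100 | | Unverified |
| 2 | PaLM 2 (few-shot, k=3, Direct) | Accuracy | 96.4 | | Unverified |
| 3 | PaLM 540B (few-shot, k=3) | Accuracy | 39.6 | | Unverified |
| 4 | BLOOM 176B (few-shot, k=3) | Accuracy | 36.8 | | Unverified |
| 5 | Chinchilla-70B (few-shot, k=5) | Accuracy | 32 | | Unverified |
| 6 | Bloomberg GPT (few-shot, k=3) | Accuracy | 29.2 | | Unverified |
| 7 | OPT 66B (few-shot, k=3) | Accuracy | 23.6 | | Unverified |
| 8 | GPT-NeoX (few-shot, k=3) | Accuracy | 21.2 | | Unverified |
| 9 | Gopher-280B (few-shot, k=5) | Accuracy | 19 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | Chinchilla-70B (few-shot, k=5) | Accuracy | 44 | | Unverified |
| 2 | PaLM-540B (few-shot, k=5) | Accuracy | 42.4 | | Unverified |
| 3 | PaLM-62B (few-shot, k=5) | Accuracy | 36.5 | | Unverified |
| 4 | Gopher-280B (few-shot, k=5) | Accuracy | 35.1 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | PaLM-540B (few-shot, k=5) | Accuracy | 73.9 | | Unverified |
| 2 | Chinchilla-70B (few-shot, k=5) | Accuracy | 68.3 | | Unverified |
| 3 | PaLM-62B (few-shot, k=5) | Accuracy | 65.4 | | Unverified |
| 4 | Gopher-280B (few-shot, k=5) | Accuracy | 61 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | Human benchmark | Accuracy | 83.7 | | Unverified |
| 2 | RuGPT-3 Large | Accuracy | 40.7 | | Unverified |
| 3 | RuGPT-3 Medium | Accuracy | 38 | | Unverified |
| 4 | RuGPT-3 Small | Accuracy | 34 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | Human benchmark | Accuracy | 87 | | Unverified |
| 2 | RuGPT-3 Small | Accuracy | 57.9 | | Unverified |
| 3 | RuGPT-3 Medium | Accuracy | 57.2 | | Unverified |
| 4 | RuGPT-3 Large | Accuracy | 55.5 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | Chinchilla-70B (few-shot, k=5) | Accuracy | 72.1 | | Unverified |
| 2 | Gopher-280B (few-shot, k=5) | Accuracy | 58.9 | | Unverified |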