SOTAVerified

Logical Reasoning

Papers

Showing 1-10 of 747 papers

| Title | Status | Hype |
|---|---|---|
| FEVO: Financial Knowledge Expansion and Reasoning Evolution for Large Language Models | | 0 |
| MiCo: Multi-image Contrast for Reinforcement Visual Reasoning | | 0 |
| Discrete JEPA: Learning Discrete Token Representations without Reconstruction | | 0 |
| CAPO: Reinforcing Consistent Reasoning in Medical Decision-Making | | 0 |
| SoundMind: RL-Incentivized Logic Reasoning for Audio-Language Models | Code | 5 |
| Motion-R1: Chain-of-Thought Reasoning and Reinforcement Learning for Human Motion Generation | | 0 |
| TeleMath: A Benchmark for Large Language Models in Telecom Mathematical Problem Solving | | 0 |
| TTT-Bench: A Benchmark for Evaluating Reasoning Ability with Simple and Novel Tic-Tac-Toe-style Games | | 0 |
| EviNet: Evidential Reasoning Network for Resilient Graph Learning in the Open and Noisy Environments | Code | 0 |
| Are LLMs Reliable Translators of Logical Reasoning Across Lexically Diversified Contexts? | Code | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | PaLM 2 (few-shot, k=3, CoT) | Accuracy | 84.9 | | Unverified |
| 2 | PaLM 2 (few-shot, k=3, Direct) | Accuracy | 65.8 | | Unverified |
| 3 | Chinchilla-70B (few-shot, k=5) | Accuracy | 48.7 | | Unverified |
| 4 | PaLM 540B (few-shot, k=3) | Accuracy | 44.5 | | Unverified |
| 5 | Gopher-280B (few-shot, k=5) | Accuracy | 40.6 | | Unverified |
| 6 | BLOOM 176B (few-shot, k=3) | Accuracy | 40.41 | | Unverified |
| 7 | Bloomberg GPT (few-shot, k=3) | Accuracy | 37.67 | | Unverified |
| 8 | GPT-NeoX (few-shot, k=3) | Accuracy | 33.56 | | Unverified |
| 9 | OPT 66B (few-shot, k=3) | Accuracy | 28.08 | | Unverified |