SOTAVerified

Logical Reasoning

Papers

Showing 1-10 of 747 papers

Title | Status | Hype
FEVO: Financial Knowledge Expansion and Reasoning Evolution for Large Language Models | - | 0
MiCo: Multi-image Contrast for Reinforcement Visual Reasoning | - | 0
Discrete JEPA: Learning Discrete Token Representations without Reconstruction | - | 0
CAPO: Reinforcing Consistent Reasoning in Medical Decision-Making | - | 0
SoundMind: RL-Incentivized Logic Reasoning for Audio-Language Models | Code | 5
Motion-R1: Chain-of-Thought Reasoning and Reinforcement Learning for Human Motion Generation | - | 0
TeleMath: A Benchmark for Large Language Models in Telecom Mathematical Problem Solving | - | 0
TTT-Bench: A Benchmark for Evaluating Reasoning Ability with Simple and Novel Tic-Tac-Toe-style Games | - | 0
EviNet: Evidential Reasoning Network for Resilient Graph Learning in the Open and Noisy Environments | Code | 0
Are LLMs Reliable Translators of Logical Reasoning Across Lexically Diversified Contexts? | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | Claude Opus | Delta_NoContext | 28.8 | - | Unverified
2 | GPT-4o | Delta_NoContext | 25.1 | - | Unverified
3 | Gemini 1.5 Pro | Delta_NoContext | 23.4 | - | Unverified
4 | GPT-4 | Delta_NoContext | 21.5 | - | Unverified
5 | Command R+ | Delta_NoContext | 11.6 | - | Unverified
6 | GPT-3.5 | Delta_NoContext | 11.2 | - | Unverified
7 | Mixtral 8x7B | Delta_NoContext | 6.4 | - | Unverified
8 | Llama 3 8B | Delta_NoContext | 4.9 | - | Unverified
9 | Llama 3 70B | Delta_NoContext | 2.9 | - | Unverified
10 | Gemma 7B | Delta_NoContext | 2.2 | - | Unverified