SOTAVerified

Logical Reasoning

Papers

Showing 1-10 of 747 papers

Title | Status | Hype
FEVO: Financial Knowledge Expansion and Reasoning Evolution for Large Language Models | - | 0
MiCo: Multi-image Contrast for Reinforcement Visual Reasoning | - | 0
Discrete JEPA: Learning Discrete Token Representations without Reconstruction | - | 0
CAPO: Reinforcing Consistent Reasoning in Medical Decision-Making | - | 0
SoundMind: RL-Incentivized Logic Reasoning for Audio-Language Models | Code | 5
Motion-R1: Chain-of-Thought Reasoning and Reinforcement Learning for Human Motion Generation | - | 0
TeleMath: A Benchmark for Large Language Models in Telecom Mathematical Problem Solving | - | 0
TTT-Bench: A Benchmark for Evaluating Reasoning Ability with Simple and Novel Tic-Tac-Toe-style Games | - | 0
EviNet: Evidential Reasoning Network for Resilient Graph Learning in the Open and Noisy Environments | Code | 0
Are LLMs Reliable Translators of Logical Reasoning Across Lexically Diversified Contexts? | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PaLM 2 (few-shot, k=3, Direct) | Accuracy | 64.8 | - | Unverified
2 | PaLM 2 (few-shot, k=3, CoT) | Accuracy | 57.2 | - | Unverified
3 | OPT 66B (few-shot, k=3) | Accuracy | 54 | - | Unverified
4 | PaLM 540B (few-shot, k=3) | Accuracy | 53.6 | - | Unverified
5 | BLOOM 176B (few-shot, k=3) | Accuracy | 52.8 | - | Unverified
6 | GPT-NeoX 20B (few-shot, k=3) | Accuracy | 52.8 | - | Unverified
7 | Chinchilla-70B (few-shot, k=5) | Accuracy | 52.1 | - | Unverified
8 | Bloomberg GPT 50B (few-shot, k=3) | Accuracy | 50.8 | - | Unverified
9 | Gopher-280B (few-shot, k=5) | Accuracy | 50.7 | - | Unverified