SOTAVerified

GSM8K

Papers

Showing 431-439 of 439 papers

| Title | Status | Hype |
|---|---|---|
| Transcending Scaling Laws with 0.1% Extra Compute | | 0 |
| Language Models are Multilingual Chain-of-Thought Reasoners | Code | 2 |
| Complexity-Based Prompting for Multi-Step Reasoning | | 0 |
| Making Large Language Models Better Reasoners with Step-Aware Verifier | | 0 |
| Learning Math Reasoning from Self-Sampled Correct and Partially-Correct Solutions | Code | 1 |
| Large Language Models are Zero-Shot Reasoners | Code | 2 |
| Self-Consistency Improves Chain of Thought Reasoning in Language Models | Code | 1 |
| Chain-of-Thought Prompting Elicits Reasoning in Large Language Models | Code | 6 |
| Training Verifiers to Solve Math Word Problems | Code | 3 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Xolver | Accuracy | 98.1 | | Unverified |
| 2 | Orange-mini | 0-shot MRR | 98 | | Unverified |