SOTAVerified

GSM8K

Papers

Showing 426–439 of 439 papers

Title | Status | Hype
Distilling Reasoning Capabilities into Smaller Language Models | Code | 0
Explicit Knowledge Transfer for Weakly-Supervised Code Generation |  | 0
Solving math word problems with process- and outcome-based feedback |  | 0
PAL: Program-aided Language Models | Code | 3
Large Language Models Can Self-Improve |  | 0
Transcending Scaling Laws with 0.1% Extra Compute |  | 0
Language Models are Multilingual Chain-of-Thought Reasoners | Code | 2
Complexity-Based Prompting for Multi-Step Reasoning |  | 0
Making Large Language Models Better Reasoners with Step-Aware Verifier |  | 0
Learning Math Reasoning from Self-Sampled Correct and Partially-Correct Solutions | Code | 1
Large Language Models are Zero-Shot Reasoners | Code | 2
Self-Consistency Improves Chain of Thought Reasoning in Language Models | Code | 1
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models | Code | 6
Training Verifiers to Solve Math Word Problems | Code | 3
Page 18 of 18

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | Xolver | Accuracy | 98.1 |  | Unverified
2 | Orange-mini | 0-shot MRR | 98 |  | Unverified