SOTAVerified

GSM8K

Papers

Showing 431–439 of 439 papers

| Title | Status | Hype |
|---|---|---|
| Self-Evaluation Guided Beam Search for Reasoning | | 0 |
| Teaching Small Language Models to Reason | | 0 |
| Distilling Reasoning Capabilities into Smaller Language Models | Code | 0 |
| Explicit Knowledge Transfer for Weakly-Supervised Code Generation | | 0 |
| Solving math word problems with process- and outcome-based feedback | | 0 |
| Large Language Models Can Self-Improve | | 0 |
| Transcending Scaling Laws with 0.1% Extra Compute | | 0 |
| Complexity-Based Prompting for Multi-Step Reasoning | | 0 |
| Making Large Language Models Better Reasoners with Step-Aware Verifier | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Xolver | Accuracy | 98.1 | | Unverified |
| 2 | Orange-mini | 0-shot MRR | 98 | | Unverified |