SOTAVerified

GSM8K

Papers

Showing 31-40 of 439 papers

Title | Status | Hype
LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding | Code | 3
PiSSA: Principal Singular Values and Singular Vectors Adaptation of Large Language Models | Code | 3
MARIO: MAth Reasoning with code Interpreter Output -- A Reproducible Pipeline | Code | 3
SkyMath: Technical Report | Code | 3
Beyond Chain-of-Thought, Effective Graph-of-Thought Reasoning in Language Models | Code | 3
PAL: Program-aided Language Models | Code | 3
Training Verifiers to Solve Math Word Problems | Code | 3
any4: Learned 4-bit Numeric Representation for LLMs | Code | 2
Let LLMs Break Free from Overthinking via Self-Braking Tuning | Code | 2
Seek in the Dark: Reasoning via Test-Time Instance-Level Policy Gradient in Latent Space | Code | 2

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | Xolver | Accuracy | 98.1 | - | Unverified
2 | Orange-mini | 0-shot MRR | 98 | - | Unverified