SOTAVerified

GSM8K

Papers

Showing 31-40 of 439 papers

Title | Status | Hype
PiSSA: Principal Singular Values and Singular Vectors Adaptation of Large Language Models | Code | 3
PAL: Program-aided Language Models | Code | 3
Beyond Chain-of-Thought, Effective Graph-of-Thought Reasoning in Language Models | Code | 3
From Explicit CoT to Implicit CoT: Learning to Internalize CoT Step by Step | Code | 3
Monte Carlo Tree Search Boosts Reasoning via Iterative Preference Learning | Code | 3
MuMath-Code: Combining Tool-Use Large Language Models with Multi-perspective Data Augmentation for Mathematical Reasoning | Code | 3
SkyMath: Technical Report | Code | 3
LoRA-XS: Low-Rank Adaptation with Extremely Small Number of Parameters | Code | 2
MathBench: Evaluating the Theory and Application Proficiency of LLMs with a Hierarchical Mathematics Benchmark | Code | 2
MathCoder: Seamless Code Integration in LLMs for Enhanced Mathematical Reasoning | Code | 2

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | Xolver | Accuracy | 98.1 | | Unverified
2 | Orange-mini | 0-shot MRR | 98 | | Unverified