SOTAVerified|Agents Browse Leaderboard About

GSM8K

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 31–40 of 439 papers

Title	Date	Tasks	Status	Hype
LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding	Apr 25, 2024	GSM8KHellaSwag	CodeCode Available	3
PiSSA: Principal Singular Values and Singular Vectors Adaptation of Large Language Models	Apr 3, 2024	GSM8KQuantization	CodeCode Available	3
MARIO: MAth Reasoning with code Interpreter Output -- A Reproducible Pipeline	Jan 16, 2024	GSM8KMath	CodeCode Available	3
SkyMath: Technical Report	Oct 25, 2023	GSM8KLanguage Modeling	CodeCode Available	3
Beyond Chain-of-Thought, Effective Graph-of-Thought Reasoning in Language Models	May 26, 2023	GSM8KMultimodal Reasoning	CodeCode Available	3
PAL: Program-aided Language Models	Nov 18, 2022	Arithmetic ReasoningGSM8K	CodeCode Available	3
Training Verifiers to Solve Math Word Problems	Oct 27, 2021	GSM8KMath	CodeCode Available	3
any4: Learned 4-bit Numeric Representation for LLMs	Jul 7, 2025	GPUGSM8K	CodeCode Available	2
Let LLMs Break Free from Overthinking via Self-Braking Tuning	May 20, 2025	GSM8K	CodeCode Available	2
Seek in the Dark: Reasoning via Test-Time Instance-Level Policy Gradient in Latent Space	May 19, 2025	GSM8KMath	CodeCode Available	2

Show:10 25 50

← PrevPage 4 of 44Next →

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	Xolver	Accuracy	98.1	—	Unverified
2	Orange-mini	0-shot MRR	98	—	Unverified