GSM8K

Papers

Showing 21–30 of 439 papers

Title	Date	Tasks	Status	Hype
LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding	Apr 25, 2024	GSM8K, HellaSwag	Code Available	3
Scaling up Masked Diffusion Models on Text	Oct 24, 2024	GSM8K, Language Modeling	Code Available	3
PAL: Program-aided Language Models	Nov 18, 2022	Arithmetic Reasoning, GSM8K	Code Available	3
PiSSA: Principal Singular Values and Singular Vectors Adaptation of Large Language Models	Apr 3, 2024	GSM8K, Quantization	Code Available	3
From Explicit CoT to Implicit CoT: Learning to Internalize CoT Step by Step	May 23, 2024	GSM8K	Code Available	3
MuMath-Code: Combining Tool-Use Large Language Models with Multi-perspective Data Augmentation for Mathematical Reasoning	May 13, 2024	Data Augmentation, GSM8K	Code Available	3
Automatic Instruction Evolving for Large Language Models	Jun 2, 2024	GSM8K, HumanEval	Code Available	3
Beyond Chain-of-Thought, Effective Graph-of-Thought Reasoning in Language Models	May 26, 2023	GSM8K, Multimodal Reasoning	Code Available	3
LoRA-GA: Low-Rank Adaptation with Gradient Approximation	Jul 6, 2024	GSM8K, parameter-efficient fine-tuning	Code Available	3
MARIO: MAth Reasoning with code Interpreter Output -- A Reproducible Pipeline	Jan 16, 2024	GSM8K, Math	Code Available	3

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	Xolver	Accuracy	98.1	—	Unverified
2	Orange-mini	0-shot MRR	98	—	Unverified