SOTAVerified

GSM8K

Papers

Showing 21–30 of 439 papers

Title	Date	Tasks	Status	Hype	Score
Automatic Instruction Evolving for Large Language Models	Jun 2, 2024	GSM8K, HumanEval	Code Available	3	5
Scaling up Masked Diffusion Models on Text	Oct 24, 2024	GSM8K, Language Modeling	Code Available	3	5
PiSSA: Principal Singular Values and Singular Vectors Adaptation of Large Language Models	Apr 3, 2024	GSM8K, Quantization	Code Available	3	5
MuMath-Code: Combining Tool-Use Large Language Models with Multi-perspective Data Augmentation for Mathematical Reasoning	May 13, 2024	Data Augmentation, GSM8K	Code Available	3	5
From Explicit CoT to Implicit CoT: Learning to Internalize CoT Step by Step	May 23, 2024	GSM8K	Code Available	3	5
MARIO: MAth Reasoning with code Interpreter Output -- A Reproducible Pipeline	Jan 16, 2024	GSM8K, Math	Code Available	3	5
Monte Carlo Tree Search Boosts Reasoning via Iterative Preference Learning	May 1, 2024	ARC, GSM8K	Code Available	3	5
LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding	Apr 25, 2024	GSM8K, HellaSwag	Code Available	3	5
Beyond Chain-of-Thought, Effective Graph-of-Thought Reasoning in Language Models	May 26, 2023	GSM8K, Multimodal Reasoning	Code Available	3	5
LoRA-GA: Low-Rank Adaptation with Gradient Approximation	Jul 6, 2024	GSM8K, parameter-efficient fine-tuning	Code Available	3	5

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	Xolver	Accuracy	98.1	—	Unverified
2	Orange-mini	0-shot MRR	98	—	Unverified