GSM8K

Papers

Showing 341–350 of 439 papers

Title	Date	Tasks	Status	Hype
GLoRe: When, Where, and How to Improve LLM Reasoning via Global and Local Refinements	Feb 13, 2024	GSM8K, Math	Unverified	0
Autonomous Data Selection with Zero-shot Generative Classifiers for Mathematical Texts	Feb 12, 2024	Continual Pretraining, GSM8K	Code Available	2
The Unreasonable Effectiveness of Eccentric Automatic Prompts	Feb 9, 2024	Arithmetic Reasoning, GSM8K	Unverified	0
InternLM-Math: Open Math Large Language Models Toward Verifiable Reasoning	Feb 9, 2024	Data Augmentation, GSM8K	Code Available	4
In-Context Principle Learning from Mistakes	Feb 8, 2024	GSM8K, In-Context Learning	Code Available	0
Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning	Feb 8, 2024	GSM8K, reinforcement-learning	Code Available	2
RevOrder: A Novel Method for Enhanced Arithmetic in Language Models	Feb 6, 2024	GSM8K, Math	Unverified	0
Multi-step Problem Solving Through a Verifier: An Empirical Analysis on Model-induced Process Supervision	Feb 5, 2024	GSM8K, Math	Unverified	0
YODA: Teacher-Student Progressive Learning for Language Models	Jan 28, 2024	GSM8K, Math	Unverified	0
SuperCLUE-Math6: Graded Multi-Step Math Reasoning Benchmark for LLMs in Chinese	Jan 22, 2024	Diversity, GSM8K	Code Available	2

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	Xolver	Accuracy	98.1	—	Unverified
2	Orange-mini	0-shot MRR	98	—	Unverified