GSM8K

Papers

Showing 141–150 of 439 papers

Title	Date	Tasks	Status	Hype
Breaking Language Barriers in Multilingual Mathematical Reasoning: Insights and Observations	Oct 31, 2023	GSM8K, Math	Code Available	1
Learning From Mistakes Makes LLM Better Reasoner	Oct 31, 2023	GSM8K, Math	Code Available	1
TRACE: A Comprehensive Benchmark for Continual Learning in Large Language Models	Oct 10, 2023	Code Generation, Continual Learning	Code Available	1
Design of Chain-of-Thought in Math Problem Solving	Sep 20, 2023	Diversity, GSM8K	Code Available	1
Large Language Models as Optimizers	Sep 7, 2023	GSM8K	Code Available	1
AskIt: Unified Programming Interface for Programming with Large Language Models	Aug 29, 2023	Code Generation, Few-Shot Learning	Code Available	1
SelfCheck: Using LLMs to Zero-Shot Check Their Own Step-by-Step Reasoning	Aug 1, 2023	GSM8K, Math	Code Available	1
Matrix Information Theory for Self-Supervised Learning	May 27, 2023	Contrastive Learning, GSM8K	Code Available	1
GRACE: Discriminator-Guided Chain-of-Thought Reasoning	May 24, 2023	GSM8K, Math	Code Available	1
Automatic Model Selection with Large Language Models for Reasoning	May 23, 2023	Arithmetic Reasoning, GSM8K	Code Available	1

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	Xolver	Accuracy	98.1	—	Unverified
2	Orange-mini	0-shot MRR	98	—	Unverified