SOTAVerified|Agents Browse Leaderboard About

GSM8K

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 431–439 of 439 papers

Title	Date	Tasks	Status	Hype
Transcending Scaling Laws with 0.1% Extra Compute	Oct 20, 2022	Arithmetic ReasoningCross-Lingual Question Answering	—Unverified	0
Language Models are Multilingual Chain-of-Thought Reasoners	Oct 6, 2022	GSM8KMath	CodeCode Available	2
Complexity-Based Prompting for Multi-Step Reasoning	Oct 3, 2022	Date UnderstandingGSM8K	—Unverified	0
Making Large Language Models Better Reasoners with Step-Aware Verifier	Jun 6, 2022	Arithmetic ReasoningFew-Shot Learning	—Unverified	0
Learning Math Reasoning from Self-Sampled Correct and Partially-Correct Solutions	May 28, 2022	Arithmetic ReasoningEfficient Exploration	CodeCode Available	1
Large Language Models are Zero-Shot Reasoners	May 24, 2022	Arithmetic ReasoningCommon Sense Reasoning	CodeCode Available	2
Self-Consistency Improves Chain of Thought Reasoning in Language Models	Mar 21, 2022	ARCArithmetic Reasoning	CodeCode Available	1
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models	Jan 28, 2022	Common Sense ReasoningGSM8K	CodeCode Available	6
Training Verifiers to Solve Math Word Problems	Oct 27, 2021	GSM8KMath	CodeCode Available	3

Show:10 25 50

← PrevPage 44 of 44Next →

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	Xolver	Accuracy	98.1	—	Unverified
2	Orange-mini	0-shot MRR	98	—	Unverified