SOTAVerified|Agents Browse Leaderboard About Blog

GSM8K

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 426–439 of 439 papers

Title	Date	Tasks	Status	Hype
Distilling Reasoning Capabilities into Smaller Language Models	Dec 1, 2022	GSM8KKnowledge Distillation	CodeCode Available	0
Explicit Knowledge Transfer for Weakly-Supervised Code Generation	Nov 30, 2022	Code GenerationFew-Shot Learning	—Unverified	0
Solving math word problems with process- and outcome-based feedback	Nov 25, 2022	Arithmetic ReasoningGSM8K	—Unverified	0
PAL: Program-aided Language Models	Nov 18, 2022	Arithmetic ReasoningGSM8K	CodeCode Available	3
Large Language Models Can Self-Improve	Oct 20, 2022	Arithmetic ReasoningCommon Sense Reasoning	—Unverified	0
Transcending Scaling Laws with 0.1% Extra Compute	Oct 20, 2022	Arithmetic ReasoningCross-Lingual Question Answering	—Unverified	0
Language Models are Multilingual Chain-of-Thought Reasoners	Oct 6, 2022	GSM8KMath	CodeCode Available	2
Complexity-Based Prompting for Multi-Step Reasoning	Oct 3, 2022	Date UnderstandingGSM8K	—Unverified	0
Making Large Language Models Better Reasoners with Step-Aware Verifier	Jun 6, 2022	Arithmetic ReasoningFew-Shot Learning	—Unverified	0
Learning Math Reasoning from Self-Sampled Correct and Partially-Correct Solutions	May 28, 2022	Arithmetic ReasoningEfficient Exploration	CodeCode Available	1
Large Language Models are Zero-Shot Reasoners	May 24, 2022	Arithmetic ReasoningCommon Sense Reasoning	CodeCode Available	2
Self-Consistency Improves Chain of Thought Reasoning in Language Models	Mar 21, 2022	ARCArithmetic Reasoning	CodeCode Available	1
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models	Jan 28, 2022	Common Sense ReasoningGSM8K	CodeCode Available	6
Training Verifiers to Solve Math Word Problems	Oct 27, 2021	GSM8KMath	CodeCode Available	3

Show:10 25 50

← PrevPage 18 of 18Next →

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	Xolver	Accuracy	98.1	—	Unverified
2	Orange-mini	0-shot MRR	98	—	Unverified