SOTAVerified|Agents Browse Leaderboard About

GSM8K

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 331–340 of 439 papers

Title	Date	Tasks	Status	Hype
Look Before You Leap: Problem Elaboration Prompting Improves Mathematical Reasoning in Large Language Models	Feb 24, 2024	GSM8KMathematical Reasoning	—Unverified	0
Fine-Grained Self-Endorsement Improves Factuality and Reasoning	Feb 23, 2024	GSM8KLanguage Modeling	—Unverified	0
Distillation Contrastive Decoding: Improving LLMs Reasoning with Contrastive Decoding and Distillation	Feb 21, 2024	Arithmetic ReasoningGSM8K	CodeCode Available	1
SymBa: Symbolic Backward Chaining for Structured Natural Language Reasoning	Feb 20, 2024	Arithmetic ReasoningGSM8K	—Unverified	0
Reformatted Alignment	Feb 19, 2024	GSM8KHallucination	CodeCode Available	2
Orca-Math: Unlocking the potential of SLMs in Grade School Math	Feb 16, 2024	Arithmetic ReasoningGSM8K	—Unverified	0
Language Models as Science Tutors	Feb 16, 2024	GSM8KMath	CodeCode Available	1
Can Separators Improve Chain-of-Thought Prompting?	Feb 16, 2024	8kGSM8K	—Unverified	0
OpenMathInstruct-1: A 1.8 Million Math Instruction Tuning Dataset	Feb 15, 2024	Arithmetic ReasoningGSM8K	CodeCode Available	4
Premise Order Matters in Reasoning with Large Language Models	Feb 14, 2024	GSM8KMathematical Problem-Solving	—Unverified	0

Show:10 25 50

← PrevPage 34 of 44Next →

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	Xolver	Accuracy	98.1	—	Unverified
2	Orange-mini	0-shot MRR	98	—	Unverified