SOTAVerified

GSM8K

Papers

Showing 361–370 of 439 papers

Title	Date	Tasks	Status	Hype
Fewer is More: Boosting LLM Reasoning with Reinforced Context Pruning	Dec 14, 2023	Arithmetic Reasoning, Few-Shot Learning	Unverified	0
Math-Shepherd: Verify and Reinforce LLMs Step-by-step without Human Annotations	Dec 14, 2023	Arithmetic Reasoning, GSM8K	Code Available	1
Training Chain-of-Thought via Latent-Variable Inference	Nov 28, 2023	GSM8K	Unverified	0
AlignedCoT: Prompting Large Language Models via Native-Speaking Demonstrations	Nov 22, 2023	Common Sense Reasoning, GSM8K	Code Available	0
Meta Prompting for AI Systems	Nov 20, 2023	Data Interaction, GSM8K	Code Available	2
Token-Level Adaptation of LoRA Adapters for Downstream Task Generalization	Nov 17, 2023	ARC, GSM8K	Code Available	1
OVM, Outcome-supervised Value Models for Planning in Mathematical Reasoning	Nov 16, 2023	Arithmetic Reasoning, GSM8K	Code Available	1
Neuro-Symbolic Integration Brings Causal and Reliable Reasoning Proofs	Nov 16, 2023	Arithmetic Reasoning, GSM8K	Code Available	1
First-Step Advantage: Importance of Starting Right in Multi-Step Math Reasoning	Nov 14, 2023	GSM8K, Math	Unverified	0
The ART of LLM Refinement: Ask, Refine, and Trust	Nov 14, 2023	Arithmetic Reasoning, GSM8K	Unverified	0

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	Xolver	Accuracy	98.1	—	Unverified
2	Orange-mini	0-shot MRR	98	—	Unverified