SOTAVerified

GSM8K

Papers

Showing 351-360 of 439 papers

Title | Status | Hype
System-2 Mathematical Reasoning via Enriched Instruction Tuning | | 0
Tapered Off-Policy REINFORCE: Stable and efficient reinforcement learning for LLMs | | 0
Teaching Small Language Models to Reason | | 0
The Alignment Ceiling: Objective Mismatch in Reinforcement Learning from Human Feedback | | 0
The ART of LLM Refinement: Ask, Refine, and Trust | | 0
The Role of Deductive and Inductive Reasoning in Large Language Models | | 0
The Unreasonable Effectiveness of Eccentric Automatic Prompts | | 0
Think before you speak: Training Language Models With Pause Tokens | | 0
Think Beyond Size: Adaptive Prompting for More Effective Reasoning | | 0
Threshold Filtering Packing for Supervised Fine-Tuning: Training Related Samples within Packs | | 0
Page 36 of 44

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | Xolver | Accuracy | 98.1 | | Unverified
2 | Orange-mini | 0-shot MRR | 98 | | Unverified