SOTAVerified

GSM8K

Papers

Showing 401-410 of 439 papers

Title | Status | Hype
AlignedCoT: Prompting Large Language Models via Native-Speaking Demonstrations | Code | 0
First-Step Advantage: Importance of Starting Right in Multi-Step Math Reasoning |  | 0
SAIE Framework: Support Alone Isn't Enough -- Advancing LLM Training with Adversarial Remarks |  | 0
The ART of LLM Refinement: Ask, Refine, and Trust |  | 0
Let's Reinforce Step by Step |  | 0
Prompt Engineering a Prompt Engineer |  | 0
The Alignment Ceiling: Objective Mismatch in Reinforcement Learning from Human Feedback |  | 0
SEGO: Sequential Subgoal Optimization for Mathematical Problem-Solving | Code | 0
Let's reward step by step: Step-Level reward model as the Navigators for Reasoning |  | 0
DavIR: Data Selection via Implicit Reward for Large Language Models |  | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | Xolver | Accuracy | 98.1 |  | Unverified
2 | Orange-mini | 0-shot MRR | 98 |  | Unverified