SOTAVerified

GSM8K

Papers

Showing 171-180 of 439 papers

Title | Status | Hype
Adaptive Rectification Sampling for Test-Time Compute Scaling | Code | 0
EchoPrompt: Instructing the Model to Rephrase Queries for Improved In-context Learning | Code | 0
Reasoning Under 1 Billion: Memory-Augmented Reinforcement Learning for Large Language Models | Code | 0
Earlier Tokens Contribute More: Learning Direct Preference Optimization From Temporal Decay Perspective | Code | 0
Re-Initialization Token Learning for Tool-Augmented Large Language Models | Code | 0
Scaling Speculative Decoding with Lookahead Reasoning | Code | 0
Don't Get Lost in the Trees: Streamlining LLM Reasoning by Overcoming Tree Search Exploration Pitfalls | Code | 0
PaD: Program-aided Distillation Can Teach Small Models Reasoning Better than Chain-of-thought Fine-tuning | Code | 0
Can LLMs Reason in the Wild with Programs? | Code | 0
NLoRA: Nyström-Initiated Low-Rank Adaptation for Large Language Models | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | Xolver | Accuracy | 98.1 | | Unverified
2 | Orange-mini | 0-shot MRR | 98 | | Unverified