SOTAVerified

GSM8K

Papers

Showing 381–390 of 439 papers

Title | Status | Hype
Synthetic Data Generation & Multi-Step RL for Reasoning & Tool Use | | 0
Fewer is More: Boosting LLM Reasoning with Reinforced Context Pruning | | 0
System-1.5 Reasoning: Traversal in Language and Latent Spaces with Dynamic Shortcuts | | 0
System-2 Mathematical Reasoning via Enriched Instruction Tuning | | 0
BARE: Leveraging Base Language Models for Few-Shot Synthetic Data Generation | | 0
Tapered Off-Policy REINFORCE: Stable and efficient reinforcement learning for LLMs | | 0
Teaching Small Language Models to Reason | | 0
Adaptive Decoding via Latent Preference Optimization | | 0
Adapting LLM Agents with Universal Feedback in Communication | | 0
The Alignment Ceiling: Objective Mismatch in Reinforcement Learning from Human Feedback | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | Xolver | Accuracy | 98.1 | | Unverified
2 | Orange-mini | 0-shot MRR | 98 | | Unverified