SOTAVerified

GSM8K

Papers

Showing 21-30 of 439 papers

Title | Status | Hype
Syzygy of Thoughts: Improving LLM CoT with the Minimal Free Resolution | Code | 3
TokenSkip: Controllable Chain-of-Thought Compression in LLMs | Code | 3
Scaling up Masked Diffusion Models on Text | Code | 3
Large Language Monkeys: Scaling Inference Compute with Repeated Sampling | Code | 3
LoRA-GA: Low-Rank Adaptation with Gradient Approximation | Code | 3
Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs | Code | 3
Automatic Instruction Evolving for Large Language Models | Code | 3
From Explicit CoT to Implicit CoT: Learning to Internalize CoT Step by Step | Code | 3
MuMath-Code: Combining Tool-Use Large Language Models with Multi-perspective Data Augmentation for Mathematical Reasoning | Code | 3
Monte Carlo Tree Search Boosts Reasoning via Iterative Preference Learning | Code | 3

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | Xolver | Accuracy | 98.1 | - | Unverified
2 | Orange-mini | 0-shot MRR | 98 | - | Unverified