SOTAVerified|Agents Browse Leaderboard About

GSM8K

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 181–190 of 439 papers

Title	Date	Tasks	Status	Hype
LoRA Done RITE: Robust Invariant Transformation Equilibration for LoRA Optimization	Oct 27, 2024	GSM8KHellaSwag	CodeCode Available	1
Rethinking Data Synthesis: A Teacher Model Training Recipe with Interpretation	Oct 27, 2024	GSM8KLanguage Modeling	—Unverified	0
ReasonAgain: Using Extractable Symbolic Programs to Evaluate Mathematical Reasoning	Oct 24, 2024	GSM8KMath	—Unverified	0
Scaling up Masked Diffusion Models on Text	Oct 24, 2024	GSM8KLanguage Modeling	CodeCode Available	3
Adaptive Dense Reward: Understanding the Gap Between Action and Reward Space in Alignment	Oct 23, 2024	GSM8KHumanEval	—Unverified	0
Math Neurosurgery: Isolating Language Models' Math Reasoning Abilities Using Only Forward Passes	Oct 22, 2024	GSM8KLanguage Modeling	CodeCode Available	1
Optimizing Chain-of-Thought Reasoning: Tackling Arranging Bottleneck via Plan Augmentation	Oct 22, 2024	GSM8KMath	—Unverified	0
SMART: Self-learning Meta-strategy Agent for Reasoning Tasks	Oct 21, 2024	GSM8KSelf-Learning	CodeCode Available	0
On Designing Effective RL Reward at Training Time for LLM Reasoning	Oct 19, 2024	GSM8KMath	—Unverified	0
TreeBoN: Enhancing Inference-Time Alignment with Speculative Tree-Search and Best-of-N Sampling	Oct 18, 2024	Computational EfficiencyGSM8K	—Unverified	0

Show:10 25 50

← PrevPage 19 of 44Next →

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	Xolver	Accuracy	98.1	—	Unverified
2	Orange-mini	0-shot MRR	98	—	Unverified