SOTAVerified

GSM8K

Papers

Showing 61-70 of 439 papers

Title | Status | Hype
Exploring the Compositional Deficiency of Large Language Models in Mathematical Reasoning | Code | 2
Self-Explore: Enhancing Mathematical Reasoning in Language Models with Fine-grained Rewards | Code | 2
any4: Learned 4-bit Numeric Representation for LLMs | Code | 2
MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models | Code | 2
Balancing LoRA Performance and Efficiency with Simple Shard Sharing | Code | 2
Big-Math: A Large-Scale, High-Quality Math Dataset for Reinforcement Learning in Language Models | Code | 2
MathCoder: Seamless Code Integration in LLMs for Enhanced Mathematical Reasoning | Code | 2
Meta Prompting for AI Systems | Code | 2
LoRA-XS: Low-Rank Adaptation with Extremely Small Number of Parameters | Code | 2
Enhancing Multi-Step Reasoning Abilities of Language Models through Direct Q-Function Optimization | Code | 2

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | Xolver | Accuracy | 98.1 | | Unverified
2 | Orange-mini | 0-shot MRR | 98 | | Unverified