SOTAVerified

GSM8K

Papers

Showing 276–300 of 439 papers

| Title | Status | Hype |
|---|---|---|
| ShareLoRA: Parameter Efficient and Robust Large Language Model Fine-tuning via Shared Low-Rank Adaptation | Code | 0 |
| Accessing GPT-4 level Mathematical Olympiad Solutions via Monte Carlo Tree Self-refine with LLaMa-3 8B | Code | 5 |
| Uncertainty Aware Learning for Language Model Alignment | | 0 |
| Improve Mathematical Reasoning in Language Models by Automated Process Supervision | | 0 |
| Does your data spark joy? Performance gains from domain upsampling at the end of training | | 0 |
| Automatic Instruction Evolving for Large Language Models | Code | 3 |
| GKT: A Novel Guidance-Based Knowledge Transfer Framework For Efficient Cloud-edge Collaboration LLM Deployment | Code | 0 |
| SpecDec++: Boosting Speculative Decoding via Adaptive Candidate Lengths | | 0 |
| Arithmetic Reasoning with LLM: Prolog Generation & Permutation | | 0 |
| LoRA-XS: Low-Rank Adaptation with Extremely Small Number of Parameters | Code | 2 |
| Multi-Reference Preference Optimization for Large Language Models | | 0 |
| MindStar: Enhancing Math Reasoning in Pre-trained LLMs at Inference Time | | 0 |
| Revisiting MoE and Dense Speed-Accuracy Comparisons for LLM Training | Code | 7 |
| ZipCache: Accurate and Efficient KV Cache Quantization with Salient Token Identification | Code | 1 |
| Unchosen Experts Can Contribute Too: Unleashing MoE Models' Power by Self-Contrast | Code | 1 |
| From Explicit CoT to Implicit CoT: Learning to Internalize CoT Step by Step | Code | 3 |
| Multiple-Choice Questions are Efficient and Robust LLM Evaluators | Code | 1 |
| MathBench: Evaluating the Theory and Application Proficiency of LLMs with a Hierarchical Mathematics Benchmark | Code | 2 |
| Metacognitive Capabilities of LLMs: An Exploration in Mathematical Problem Solving | | 0 |
| Meaning-Typed Programming: Language Abstraction and Runtime for Model-Integrated Applications | | 0 |
| MuMath-Code: Combining Tool-Use Large Language Models with Multi-perspective Data Augmentation for Mathematical Reasoning | Code | 3 |
| MathDivide: Improved mathematical reasoning by large language models | | 0 |
| MAmmoTH2: Scaling Instructions from the Web | | 0 |
| Exploring the Compositional Deficiency of Large Language Models in Mathematical Reasoning | Code | 2 |
| Monte Carlo Tree Search Boosts Reasoning via Iterative Preference Learning | Code | 3 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Xolver | Accuracy | 98.1 | | Unverified |
| 2 | Orange-mini | 0-shot MRR | 98 | | Unverified |