SOTAVerified

GSM8K

Papers

Showing 121-130 of 439 papers

Title | Status | Hype
Masked Thought: Simply Masking Partial Reasoning Steps Can Improve Mathematical Reasoning Learning of Language Models | Code | 1
Learning From Mistakes Makes LLM Better Reasoner | Code | 1
Boosted Prompt Ensembles for Large Language Models | Code | 1
Learning Math Reasoning from Self-Sampled Correct and Partially-Correct Solutions | Code | 1
Design of Chain-of-Thought in Math Problem Solving | Code | 1
Self-Training Elicits Concise Reasoning in Large Language Models | Code | 1
Learning Goal-Conditioned Representations for Language Reward Models | Code | 1
DELLA-Merging: Reducing Interference in Model Merging through Magnitude-Based Sampling | Code | 1
Data Whisperer: Efficient Data Selection for Task-Specific LLM Fine-Tuning via Few-Shot In-Context Learning | Code | 1
Data Contamination Quiz: A Tool to Detect and Estimate Contamination in Large Language Models | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | Xolver | Accuracy | 98.1 | | Unverified
2 | Orange-mini | 0-shot MRR | 98 | | Unverified