SOTAVerified

GSM8K

Papers

Showing 181-190 of 439 papers

Title | Status | Hype
Don't Get Lost in the Trees: Streamlining LLM Reasoning by Overcoming Tree Search Exploration Pitfalls | Code | 0
Can LLMs Reason in the Wild with Programs? | Code | 0
DIVE: Diversified Iterative Self-Improvement | Code | 0
PaD: Program-aided Distillation Can Teach Small Models Reasoning Better than Chain-of-thought Fine-tuning | Code | 0
ArithmAttack: Evaluating Robustness of LLMs to Noisy Context in Math Problem Solving | Code | 0
Distilling Reasoning Capabilities into Smaller Language Models | Code | 0
Not All Votes Count! Programs as Verifiers Improve Self-Consistency of Language Models for Math Reasoning | Code | 0
NLoRA: Nyström-Initiated Low-Rank Adaptation for Large Language Models | Code | 0
Discriminative Policy Optimization for Token-Level Reward Models | Code | 0
DiscQuant: A Quantization Method for Neural Networks Inspired by Discrepancy Theory | Code | 0
Page 19 of 44

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | Xolver | Accuracy | 98.1 | | Unverified
2 | Orange-mini | 0-shot MRR | 98 | | Unverified