SOTAVerified

GSM8K

Papers

Showing 411-420 of 439 papers

Title | Status | Hype
ShareLoRA: Parameter Efficient and Robust Large Language Model Fine-tuning via Shared Low-Rank Adaptation | Code | 0
EchoPrompt: Instructing the Model to Rephrase Queries for Improved In-context Learning | Code | 0
LLM-TOPLA: Efficient LLM Ensemble by Maximising Diversity | Code | 0
LLM2: Let Large Language Models Harness System 2 Reasoning | Code | 0
COrAL: Order-Agnostic Language Modeling for Efficient Iterative Refinement | Code | 0
Upweighting Easy Samples in Fine-Tuning Mitigates Forgetting | Code | 0
Learning a Continue-Thinking Token for Enhanced Test-Time Scaling | Code | 0
Inference-Time Decontamination: Reusing Leaked Benchmarks for Large Language Model Evaluation | Code | 0
SMART: Self-learning Meta-strategy Agent for Reasoning Tasks | Code | 0
Can LLMs Reason in the Wild with Programs? | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | Xolver | Accuracy | 98.1 | | Unverified
2 | Orange-mini | 0-shot MRR | 98 | | Unverified