SOTAVerified

GSM8K

Papers

Showing 226-250 of 439 papers

Title | Status | Hype
MathGenie: Generating Synthetic Data with Question Back-translation for Enhancing Mathematical Reasoning of LLMs | | 0
InfiFusion: A Unified Framework for Enhanced Cross-Model Reasoning via LLM Fusion | | 0
Accelerating Chain-of-Thought Reasoning: When Goal-Gradient Importance Meets Dynamic Skipping | | 0
Improving LLM Reasoning through Scaling Inference Computation with Collaborative Verification | | 0
Maximizing Confidence Alone Improves Reasoning | | 0
Memory-Efficient LLM Training by Various-Grained Low-Rank Projection of Gradients | | 0
Metacognitive Capabilities of LLMs: An Exploration in Mathematical Problem Solving | | 0
Improving Complex Reasoning with Dynamic Prompt Corruption: A soft prompt Optimization Approach | | 0
Improve Mathematical Reasoning in Language Models by Automated Process Supervision | | 0
MIND: Math Informed syNthetic Dialogues for Pretraining LLMs | | 0
MindStar: Enhancing Math Reasoning in Pre-trained LLMs at Inference Time | | 0
Turning Up the Heat: Min-p Sampling for Creative and Coherent LLM Outputs | | 0
Mixture of Cache-Conditional Experts for Efficient Mobile Device Inference | | 0
Model Unlearning via Sparse Autoencoder Subspace Guided Projections | | 0
Hybrid Offline-online Scheduling Method for Large Language Model Inference Optimization | | 0
Guideline Forest: Experience-Induced Multi-Guideline Reasoning with Stepwise Aggregation | | 0
Multi-Reference Preference Optimization for Large Language Models | | 0
Multi-step Problem Solving Through a Verifier: An Empirical Analysis on Model-induced Process Supervision | | 0
GLoRe: When, Where, and How to Improve LLM Reasoning via Global and Local Refinements | | 0
GEMMAS: Graph-based Evaluation Metrics for Multi Agent Systems | | 0
From Words to Watts: Benchmarking the Energy Costs of Large Language Model Inference | | 0
From Good to Great: Improving Math Reasoning with Tool-Augmented Interleaf Prompting | | 0
From Correctness to Comprehension: AI Agents for Personalized Error Diagnosis in Education | | 0
Fractional Reasoning via Latent Steering Vectors Improves Inference Time Compute | | 0
First-Step Advantage: Importance of Starting Right in Multi-Step Math Reasoning | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | Xolver | Accuracy | 98.1 | | Unverified
2 | Orange-mini | 0-shot MRR | 98 | | Unverified