SOTAVerified

GSM8K

Papers

Showing 251–300 of 439 papers

Title | Status | Hype
Iterative Reasoning Preference Optimization |  | 0
Key-Point-Driven Data Synthesis with its Enhancement on Mathematical Reasoning |  | 0
KisMATH: Do LLMs Have Knowledge of Implicit Structures in Mathematical Reasoning? |  | 0
Kwai-STaR: Transform LLMs into State-Transition Reasoners |  | 0
KwaiYiiMath: Technical Report |  | 0
Large Language Models as Analogical Reasoners |  | 0
Large Language Models Can Self-Improve |  | 0
Layer-Aware Task Arithmetic: Disentangling Task-Specific and Instruction-Following Knowledge |  | 0
LearnAlign: Reasoning Data Selection for Reinforcement Learning in Large Language Models Based on Improved Gradient Alignment |  | 0
Learning to Rank Chain-of-Thought: An Energy-Based Approach with Outcome Supervision |  | 0
Learning to Reason via Self-Iterative Process Feedback for Small Language Models |  | 0
LED-Merging: Mitigating Safety-Utility Conflicts in Model Merging with Location-Election-Disjoint |  | 0
Let's Reinforce Step by Step |  | 0
Let's reward step by step: Step-Level reward model as the Navigators for Reasoning |  | 0
Leveraging Uncertainty Estimation for Efficient LLM Routing |  | 0
LiteSearch: Efficacious Tree Search for LLM |  | 0
LLaDA 1.5: Variance-Reduced Preference Optimization for Large Language Diffusion Models |  | 0
LLaMa-SciQ: An Educational Chatbot for Answering Science MCQ |  | 0
Meaning-Typed Programming: Language Abstraction and Runtime for Model-Integrated Applications |  | 0
DavIR: Data Selection via Implicit Reward for Large Language Models |  | 0
Local Prompt Optimization |  | 0
Logic Contrastive Reasoning with Lightweight Large Language Model for Math Word Problems |  | 0
Look Before You Leap: Problem Elaboration Prompting Improves Mathematical Reasoning in Large Language Models |  | 0
LoRA-Mixer: Coordinate Modular LoRA Experts Through Serial Attention Routing |  | 0
MALT: Improving Reasoning with Multi-Agent LLM Training |  | 0
MAmmoTH2: Scaling Instructions from the Web |  | 0
MathAttack: Attacking Large Language Models Towards Math Solving Ability |  | 0
MathDivide: Improved mathematical reasoning by large language models |  | 0
MathFimer: Enhancing Mathematical Reasoning by Expanding Reasoning Steps through Fill-in-the-Middle Task |  | 0
MathGenie: Generating Synthetic Data with Question Back-translation for Enhancing Mathematical Reasoning of LLMs |  | 0
Maximizing Confidence Alone Improves Reasoning |  | 0
Memory-Efficient LLM Training by Various-Grained Low-Rank Projection of Gradients |  | 0
Metacognitive Capabilities of LLMs: An Exploration in Mathematical Problem Solving |  | 0
MIND: Math Informed syNthetic Dialogues for Pretraining LLMs |  | 0
MindStar: Enhancing Math Reasoning in Pre-trained LLMs at Inference Time |  | 0
Turning Up the Heat: Min-p Sampling for Creative and Coherent LLM Outputs |  | 0
Mixture of Cache-Conditional Experts for Efficient Mobile Device Inference |  | 0
Model Unlearning via Sparse Autoencoder Subspace Guided Projections |  | 0
Multi-Reference Preference Optimization for Large Language Models |  | 0
Multi-step Problem Solving Through a Verifier: An Empirical Analysis on Model-induced Process Supervision |  | 0
Not All Rollouts are Useful: Down-Sampling Rollouts in LLM Reinforcement Learning |  | 0
No Train Still Gain. Unleash Mathematical Reasoning of Large Language Models with Monte Carlo Tree Search Guided by Energy Function |  | 0
Nudging: Inference-time Alignment of LLMs via Guided Decoding |  | 0
On Designing Effective RL Reward at Training Time for LLM Reasoning |  | 0
Making Large Language Models Better Reasoners with Step-Aware Verifier |  | 0
Optimizing Chain-of-Thought Reasoning: Tackling Arranging Bottleneck via Plan Augmentation |  | 0
Orca-Math: Unlocking the potential of SLMs in Grade School Math |  | 0
PARAMANU-GANITA: Language Model with Mathematical Capabilities |  | 0
Patience Is The Key to Large Language Model Reasoning |  | 0
PersonaMath: Enhancing Math Reasoning through Persona-Driven Data Augmentation |  | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | Xolver | Accuracy | 98.1 |  | Unverified
2 | Orange-mini | 0-shot MRR | 98 |  | Unverified