SOTAVerified

Math

Papers

Showing 376-400 of 1596 papers

Title | Status | Hype
CER: Confidence Enhanced Reasoning in LLMs | Code | 0
Earlier Tokens Contribute More: Learning Direct Preference Optimization From Temporal Decay Perspective | Code | 0
A Survey on Feedback-based Multi-step Reasoning for Large Language Models on Mathematics | | 0
SIFT: Grounding LLM Reasoning in Contexts via Stickers | Code | 2
BeamLoRA: Beam-Constraint Low-Rank Adaptation | | 0
DiffSampling: Enhancing Diversity and Accuracy in Neural Text Generation | | 0
The Self-Improvement Paradox: Can Language Models Bootstrap Reasoning Capabilities without External Scaffolding? | | 0
TreeCut: A Synthetic Unanswerable Math Word Problem Dataset for LLM Hallucination Evaluation | Code | 0
Reasoning with Reinforced Functional Token Tuning | Code | 1
Lean-ing on Quality: How High-Quality Data Beats Diverse Multilingual Data in AutoFormalization | | 0
Multi-Step Alignment as Markov Games: An Optimistic Online Gradient Descent Approach with Convergence Guarantees | | 0
None of the Others: a General Technique to Distinguish Reasoning from Memorization in Multiple-Choice LLM Evaluation Benchmarks | | 0
S^2R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning | Code | 2
NaturalReasoning: Reasoning in the Wild with 2.8M Challenging Questions | | 0
Thinking Outside the (Gray) Box: A Context-Based Score for Assessing Value and Originality in Neural Text Generation | | 0
Thinking Preference Optimization | Code | 1
MathFimer: Enhancing Mathematical Reasoning by Expanding Reasoning Steps through Fill-in-the-Middle Task | | 0
Scaling Test-Time Compute Without Verification or RL is Suboptimal | | 0
Teaching LLMs According to Their Aptitude: Adaptive Reasoning for Mathematical Problem Solving | | 0
Energy-Conscious LLM Decoding: Impact of Text Generation Strategies on GPU Energy Consumption | | 0
Why Vision Language Models Struggle with Visual Arithmetic? Towards Enhanced Chart and Geometry Understanding | | 0
A Study on Leveraging Search and Self-Feedback for Agent Reasoning | | 0
Warmup-Distill: Bridge the Distribution Mismatch between Teacher and Student before Knowledge Distillation | Code | 0
Hypothesis-Driven Theory-of-Mind Reasoning for Large Language Models | | 0
Uncovering the Impact of Chain-of-Thought Reasoning for Direct Preference Optimization: Lessons from Text-to-SQL | Code | 1
Page 16 of 64

No leaderboard results yet.