SOTAVerified

Math

Papers

Showing 26-50 of 1596 papers

| Title | Status | Hype |
|---|---|---|
| Qwen Technical Report | Code | 6 |
| AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration | Code | 6 |
| GPT-4 Technical Report | Code | 6 |
| Chain-of-Thought Prompting Elicits Reasoning in Large Language Models | Code | 6 |
| Reinforcement Learning from Human Feedback | Code | 5 |
| Vision-R1: Incentivizing Reasoning Capability in Multimodal Large Language Models | Code | 5 |
| LIMO: Less is More for Reasoning | Code | 5 |
| Process Reinforcement through Implicit Rewards | Code | 5 |
| Free Process Rewards without Process Labels | Code | 5 |
| OpenR: An Open Source Framework for Advanced Reasoning with Large Language Models | Code | 5 |
| LiveBench: A Challenging, Contamination-Limited LLM Benchmark | Code | 5 |
| Accessing GPT-4 level Mathematical Olympiad Solutions via Monte Carlo Tree Self-refine with LLaMa-3 8B | Code | 5 |
| MARIO Eval: Evaluate Your Math LLM with your Math LLM--A mathematical dataset evaluation toolkit | Code | 5 |
| Evolutionary Optimization of Model Merging Recipes | Code | 5 |
| Common 7B Language Models Already Possess Strong Math Capabilities | Code | 5 |
| WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct | Code | 5 |
| Energy-Based Transformers are Scalable Learners and Thinkers | Code | 4 |
| Skywork Open Reasoner 1 Technical Report | Code | 4 |
| MM-PRM: Enhancing Multimodal Mathematical Reasoning with Scalable Step-Level Supervision | Code | 4 |
| AIMO-2 Winning Solution: Building State-of-the-Art Mathematical Reasoning Models with OpenMathReasoning dataset | Code | 4 |
| Light-R1: Curriculum SFT, DPO and RL for Long COT from Scratch and Beyond | Code | 4 |
| CodeI/O: Condensing Reasoning Patterns via Code Input-Output Prediction | Code | 4 |
| ReasonFlux: Hierarchical LLM Reasoning via Scaling Thought Templates | Code | 4 |
| InternLM2.5-StepProver: Advancing Automated Theorem Proving via Expert Iteration on Large-Scale LEAN Problems | Code | 4 |
| SuperCorrect: Supervising and Correcting Language Models with Error-Driven Insights | Code | 4 |

No leaderboard results yet.