Math

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 1–50 of 1596 papers

Title	Date	Tasks	Status	Hype
ChatGLM: A Family of Large Language Models from GLM-130B to GLM-4 All Tools	Jun 18, 2024	AllGSM8K	CodeCode Available	14
Qwen2.5 Technical Report	Dec 19, 2024	Common Sense Reasoning	CodeCode Available	13
Qwen2.5-Coder Technical Report	Sep 18, 2024	Code Generation	CodeCode Available	11
AgentRxiv: Towards Collaborative Autonomous Research	Mar 23, 2025	Mathscientific discovery	CodeCode Available	9
s1: Simple test-time scaling	Jan 31, 2025	Language ModelingLanguage Modelling	CodeCode Available	9
General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model	Sep 3, 2024	DecoderMath	CodeCode Available	9
DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence	Jun 17, 2024	16kLanguage Modeling	CodeCode Available	9
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models	Feb 5, 2024	Arithmetic ReasoningMath	CodeCode Available	9
EvoAgentX: An Automated Framework for Evolving Agentic Workflows	Jul 4, 2025	Code GenerationMath	CodeCode Available	7
OpenThoughts: Data Recipes for Reasoning Models	Jun 4, 2025	Math	CodeCode Available	7
AReaL: A Large-Scale Asynchronous Reinforcement Learning System for Language Reasoning	May 30, 2025	GPUMath	CodeCode Available	7
TTRL: Test-Time Reinforcement Learning	Apr 22, 2025	Mathreinforcement-learning	CodeCode Available	7
SimpleRL-Zoo: Investigating and Taming Zero Reinforcement Learning for Open Base Models in the Wild	Mar 24, 2025	Instruction FollowingMath	CodeCode Available	7
xLSTM 7B: A Recurrent LLM for Fast and Efficient Inference	Mar 17, 2025	MambaMath	CodeCode Available	7
Logic-RL: Unleashing LLM Reasoning with Rule-Based Reinforcement Learning	Feb 20, 2025	Mathreinforcement-learning	CodeCode Available	7
S*: Test Time Scaling for Code Generation	Feb 20, 2025	Code GenerationMath	CodeCode Available	7
LLMs Can Easily Learn to Reason from Demonstrations Structure, not content, is what matters!	Feb 11, 2025	Large Language ModelMath	CodeCode Available	7
Kimi k1.5: Scaling Reinforcement Learning with LLMs	Jan 22, 2025	Mathreinforcement-learning	CodeCode Available	7
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking	Jan 8, 2025	Math	CodeCode Available	7
O1 Replication Journey: A Strategic Progress Report -- Part 1	Oct 8, 2024	Mathscientific discovery	CodeCode Available	7
Unpacking DPO and PPO: Disentangling Best Practices for Learning from Preference Feedback	Jun 13, 2024	Instruction FollowingMath	CodeCode Available	7
StarCoder 2 and The Stack v2: The Next Generation	Feb 29, 2024	Code CompletionCode Generation	CodeCode Available	7
DSPy: Compiling Declarative Language Model Calls into Self-Improving Pipelines	Oct 5, 2023	Language ModelingLanguage Modelling	CodeCode Available	7
Plan-and-Solve Prompting: Improving Zero-Shot Chain-of-Thought Reasoning by Large Language Models	May 6, 2023	Math	CodeCode Available	7
Mistral 7B	Oct 10, 2023	answerability predictionArithmetic Reasoning	CodeCode Available	6
Qwen Technical Report	Sep 28, 2023	Language ModelingLanguage Modelling	CodeCode Available	6
AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration	Jun 1, 2023	Autonomous DrivingCloud Computing	CodeCode Available	6
GPT-4 Technical Report	Mar 15, 2023	answerability predictionArithmetic Reasoning	CodeCode Available	6
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models	Jan 28, 2022	Common Sense ReasoningGSM8K	CodeCode Available	6
Energy-Based Transformers are Scalable Learners and Thinkers	Jul 2, 2025	DenoisingImage Denoising	VerifiedCommunity Verified — 1 reproduction	5
Reinforcement Learning from Human Feedback	Apr 16, 2025	MathPhilosophy	CodeCode Available	5
Vision-R1: Incentivizing Reasoning Capability in Multimodal Large Language Models	Mar 9, 2025	MathMultimodal Reasoning	CodeCode Available	5
LIMO: Less is More for Reasoning	Feb 5, 2025	MathMathematical Reasoning	CodeCode Available	5
Process Reinforcement through Implicit Rewards	Feb 3, 2025	MathReinforcement Learning (RL)	CodeCode Available	5
Free Process Rewards without Process Labels	Dec 2, 2024	Math	CodeCode Available	5
OpenR: An Open Source Framework for Advanced Reasoning with Large Language Models	Oct 12, 2024	Mathreinforcement-learning	CodeCode Available	5
LiveBench: A Challenging, Contamination-Limited LLM Benchmark	Jun 27, 2024	ArticlesInstruction Following	CodeCode Available	5
Accessing GPT-4 level Mathematical Olympiad Solutions via Monte Carlo Tree Self-refine with LLaMa-3 8B	Jun 11, 2024	Decision MakingGSM8K	CodeCode Available	5
MARIO Eval: Evaluate Your Math LLM with your Math LLM--A mathematical dataset evaluation toolkit	Apr 22, 2024	Math	CodeCode Available	5
Evolutionary Optimization of Model Merging Recipes	Mar 19, 2024	Evolutionary AlgorithmsMath	CodeCode Available	5
Common 7B Language Models Already Possess Strong Math Capabilities	Mar 7, 2024	GSM8KMath	CodeCode Available	5
WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct	Aug 18, 2023	Arithmetic ReasoningGSM8K	CodeCode Available	5
Skywork Open Reasoner 1 Technical Report	May 28, 2025	MathReinforcement Learning (RL)	CodeCode Available	4
MM-PRM: Enhancing Multimodal Mathematical Reasoning with Scalable Step-Level Supervision	May 19, 2025	MathMathematical Reasoning	CodeCode Available	4
AIMO-2 Winning Solution: Building State-of-the-Art Mathematical Reasoning Models with OpenMathReasoning dataset	Apr 23, 2025	MathMathematical Reasoning	CodeCode Available	4
Light-R1: Curriculum SFT, DPO and RL for Long COT from Scratch and Beyond	Mar 13, 2025	Domain GeneralizationMath	CodeCode Available	4
CodeI/O: Condensing Reasoning Patterns via Code Input-Output Prediction	Feb 11, 2025	Code GenerationMath	CodeCode Available	4
ReasonFlux: Hierarchical LLM Reasoning via Scaling Thought Templates	Feb 10, 2025	Hierarchical Reinforcement LearningLanguage Modeling	CodeCode Available	4
InternLM2.5-StepProver: Advancing Automated Theorem Proving via Expert Iteration on Large-Scale LEAN Problems	Oct 21, 2024	Automated Theorem ProvingCPU	CodeCode Available	4
SuperCorrect: Supervising and Correcting Language Models with Error-Driven Insights	Oct 11, 2024	GSM8KMath	CodeCode Available	4

Show:10 25 50

← PrevPage 1 of 32Next →

No leaderboard results yet.