Math

Papers

Showing 1–50 of 1596 papers

Title	Date	Tasks	Status	Hype	Score
Energy-Based Transformers are Scalable Learners and Thinkers	Jul 2, 2025	Denoising, Image Denoising	Community Verified (1 reproduction)	5	18
ChatGLM: A Family of Large Language Models from GLM-130B to GLM-4 All Tools	Jun 18, 2024	All, GSM8K	Code Available	14	5
Qwen2.5 Technical Report	Dec 19, 2024	Common Sense Reasoning	Code Available	13	5
Qwen2.5-Coder Technical Report	Sep 18, 2024	Code Generation	Code Available	11	5
General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model	Sep 3, 2024	Decoder, Math	Code Available	9	5
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models	Feb 5, 2024	Arithmetic Reasoning, Math	Code Available	9	5
s1: Simple test-time scaling	Jan 31, 2025	Language Modeling, Language Modelling	Code Available	9	5
AgentRxiv: Towards Collaborative Autonomous Research	Mar 23, 2025	Math, scientific discovery	Code Available	9	5
DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence	Jun 17, 2024	16k, Language Modeling	Code Available	9	5
Plan-and-Solve Prompting: Improving Zero-Shot Chain-of-Thought Reasoning by Large Language Models	May 6, 2023	Math	Code Available	7	5
OpenThoughts: Data Recipes for Reasoning Models	Jun 4, 2025	Math	Code Available	7	5
O1 Replication Journey: A Strategic Progress Report -- Part 1	Oct 8, 2024	Math, scientific discovery	Code Available	7	5
SimpleRL-Zoo: Investigating and Taming Zero Reinforcement Learning for Open Base Models in the Wild	Mar 24, 2025	Instruction Following, Math	Code Available	7	5
DSPy: Compiling Declarative Language Model Calls into Self-Improving Pipelines	Oct 5, 2023	Language Modeling, Language Modelling	Code Available	7	5
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking	Jan 8, 2025	Math	Code Available	7	5
Kimi k1.5: Scaling Reinforcement Learning with LLMs	Jan 22, 2025	Math, reinforcement-learning	Code Available	7	5
StarCoder 2 and The Stack v2: The Next Generation	Feb 29, 2024	Code Completion, Code Generation	Code Available	7	5
LLMs Can Easily Learn to Reason from Demonstrations Structure, not content, is what matters!	Feb 11, 2025	Large Language Model, Math	Code Available	7	5
S*: Test Time Scaling for Code Generation	Feb 20, 2025	Code Generation, Math	Code Available	7	5
Unpacking DPO and PPO: Disentangling Best Practices for Learning from Preference Feedback	Jun 13, 2024	Instruction Following, Math	Code Available	7	5
xLSTM 7B: A Recurrent LLM for Fast and Efficient Inference	Mar 17, 2025	Mamba, Math	Code Available	7	5
AReaL: A Large-Scale Asynchronous Reinforcement Learning System for Language Reasoning	May 30, 2025	GPU, Math	Code Available	7	5
EvoAgentX: An Automated Framework for Evolving Agentic Workflows	Jul 4, 2025	Code Generation, Math	Code Available	7	5
TTRL: Test-Time Reinforcement Learning	Apr 22, 2025	Math, reinforcement-learning	Code Available	7	5
Logic-RL: Unleashing LLM Reasoning with Rule-Based Reinforcement Learning	Feb 20, 2025	Math, reinforcement-learning	Code Available	7	5
AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration	Jun 1, 2023	Autonomous Driving, Cloud Computing	Code Available	6	5
Qwen Technical Report	Sep 28, 2023	Language Modeling, Language Modelling	Code Available	6	5
Mistral 7B	Oct 10, 2023	answerability prediction, Arithmetic Reasoning	Code Available	6	5
GPT-4 Technical Report	Mar 15, 2023	answerability prediction, Arithmetic Reasoning	Code Available	6	5
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models	Jan 28, 2022	Common Sense Reasoning, GSM8K	Code Available	6	5
Process Reinforcement through Implicit Rewards	Feb 3, 2025	Math, Reinforcement Learning (RL)	Code Available	5	5
LiveBench: A Challenging, Contamination-Limited LLM Benchmark	Jun 27, 2024	Articles, Instruction Following	Code Available	5	5
MARIO Eval: Evaluate Your Math LLM with your Math LLM--A mathematical dataset evaluation toolkit	Apr 22, 2024	Math	Code Available	5	5
OpenR: An Open Source Framework for Advanced Reasoning with Large Language Models	Oct 12, 2024	Math, reinforcement-learning	Code Available	5	5
Accessing GPT-4 level Mathematical Olympiad Solutions via Monte Carlo Tree Self-refine with LLaMa-3 8B	Jun 11, 2024	Decision Making, GSM8K	Code Available	5	5
Common 7B Language Models Already Possess Strong Math Capabilities	Mar 7, 2024	GSM8K, Math	Code Available	5	5
WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct	Aug 18, 2023	Arithmetic Reasoning, GSM8K	Code Available	5	5
Vision-R1: Incentivizing Reasoning Capability in Multimodal Large Language Models	Mar 9, 2025	Math, Multimodal Reasoning	Code Available	5	5
LIMO: Less is More for Reasoning	Feb 5, 2025	Math, Mathematical Reasoning	Code Available	5	5
Evolutionary Optimization of Model Merging Recipes	Mar 19, 2024	Evolutionary Algorithms, Math	Code Available	5	5
Free Process Rewards without Process Labels	Dec 2, 2024	Math	Code Available	5	5
Reinforcement Learning from Human Feedback	Apr 16, 2025	Math, Philosophy	Code Available	5	5
Dive into Deep Learning	Jun 21, 2021	Deep Learning, Math	Code Available	4	5
LLaMA Pro: Progressive LLaMA with Block Expansion	Jan 4, 2024	Instruction Following, Math	Code Available	4	5
Lean Workbook: A large-scale Lean problem set formalized from natural language math problems	Jun 6, 2024	Automated Theorem Proving, Math	Code Available	4	5
LEAN-GitHub: Compiling GitHub LEAN repositories for a versatile LEAN prover	Jul 24, 2024	Automated Theorem Proving, Math	Code Available	4	5
Mutual Reasoning Makes Smaller LLMs Stronger Problem-Solvers	Aug 12, 2024	GSM8K, Math	Code Available	4	5
Let's Verify Step by Step	May 31, 2023	Active Learning, Math	Code Available	4	5
Light-R1: Curriculum SFT, DPO and RL for Long COT from Scratch and Beyond	Mar 13, 2025	Domain Generalization, Math	Code Available	4	5
InternLM2.5-StepProver: Advancing Automated Theorem Proving via Expert Iteration on Large-Scale LEAN Problems	Oct 21, 2024	Automated Theorem Proving, CPU	Code Available	4	5
