Math

Papers

Showing 61–70 of 1596 papers

Title	Date	Tasks	Status	Hype	Score
How is ChatGPT's behavior changing over time?	Jul 18, 2023	Code Generation, Language Modelling	Code Available	4	5
InternLM2.5-StepProver: Advancing Automated Theorem Proving via Expert Iteration on Large-Scale LEAN Problems	Oct 21, 2024	Automated Theorem Proving, CPU	Code Available	4	5
Light-R1: Curriculum SFT, DPO and RL for Long COT from Scratch and Beyond	Mar 13, 2025	Domain Generalization, Math	Code Available	4	5
CodeI/O: Condensing Reasoning Patterns via Code Input-Output Prediction	Feb 11, 2025	Code Generation, Math	Code Available	4	5
ReFT: Reasoning with Reinforced Fine-Tuning	Jan 17, 2024	GSM8K, Math	Code Available	4	5
PAL: Program-aided Language Models	Nov 18, 2022	Arithmetic Reasoning, GSM8K	Code Available	3	5
Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling	Feb 10, 2025	Math	Code Available	3	5
MM-Vet v2: A Challenging Benchmark to Evaluate Large Multimodal Models for Integrated Capabilities	Aug 1, 2024	Math, MM-Vet	Code Available	3	5
Monte Carlo Tree Search Boosts Reasoning via Iterative Preference Learning	May 1, 2024	ARC, GSM8K	Code Available	3	5
MuMath-Code: Combining Tool-Use Large Language Models with Multi-perspective Data Augmentation for Mathematical Reasoning	May 13, 2024	Data Augmentation, GSM8K	Code Available	3	5

No leaderboard results yet.