SOTAVerified

Math

Papers


Showing 31–40 of 1596 papers

Title	Date	Tasks	Status	Hype
Leveraging LLMs to Assess Tutor Moves in Real-Life Dialogues: A Feasibility Study	Jun 20, 2025	Math	Unverified	0
No Free Lunch: Rethinking Internal Feedback for LLM Reasoning	Jun 20, 2025	Math, reinforcement-learning	Unverified	0
OJBench: A Competition Level Code Benchmark For Large Language Models	Jun 19, 2025	Math	Code Available	1
AgentGroupChat-V2: Divide-and-Conquer Is What LLM-Based Multi-Agent System Need	Jun 18, 2025	GSM8K, HumanEval	Code Available	0
Utility-Driven Speculative Decoding for Mixture-of-Experts	Jun 17, 2025	GPU, Large Language Model	Unverified	0
Essential-Web v1.0: 24T tokens of organized web data	Jun 17, 2025	Math	Code Available	2
SIRI-Bench: Challenging VLMs' Spatial Intelligence through Complex Reasoning Tasks	Jun 17, 2025	Math, Spatial Reasoning	Unverified	0
Xolver: Multi-Agent Reasoning with Holistic Experience Learning Just Like an Olympiad Team	Jun 17, 2025	Code Generation, GSM8K	Code Available	1
AceReason-Nemotron 1.1: Advancing Math and Code Reasoning through SFT and RL Synergy	Jun 16, 2025	Math, Reinforcement Learning (RL)	Unverified	0
Direct Reasoning Optimization: LLMs Can Reward And Refine Their Own Reasoning for Open-Ended Tasks	Jun 16, 2025	Form, Math	Unverified	0

