SOTAVerified|Agents Browse Leaderboard About Blog

Math

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 11–20 of 1596 papers

Title	Date	Tasks	Status	Hype
CoRE: Enhancing Metacognition with Label-free Self-evaluation in LRMs	Jul 8, 2025	GSM8KMath	—Unverified	0
Activation Steering for Chain-of-Thought Compression	Jul 7, 2025	GSM8KMath	CodeCode Available	0
LLMThinkBench: Towards Basic Math Reasoning and Overthinking in Large Language Models	Jul 5, 2025	BenchmarkingGPU	CodeCode Available	1
EvoAgentX: An Automated Framework for Evolving Agentic Workflows	Jul 4, 2025	Code GenerationMath	CodeCode Available	7
Effects of structure on reasoning in instance-level Self-Discover	Jul 4, 2025	Math	CodeCode Available	0
Energy-Based Transformers are Scalable Learners and Thinkers	Jul 2, 2025	DenoisingImage Denoising	VerifiedCommunity Verified — 1 reproduction	5
Do Thinking Tokens Help or Trap? Towards More Efficient Large Reasoning Model	Jun 30, 2025	Math	—Unverified	0
SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning	Jun 30, 2025	MathMulti-agent Reinforcement Learning	CodeCode Available	2
Bridging Offline and Online Reinforcement Learning for LLMs	Jun 26, 2025	Instruction FollowingMath	—Unverified	0
Where to find Grokking in LLM Pretraining? Monitor Memorization-to-Generalization without Test	Jun 26, 2025	Code GenerationLarge Language Model	—Unverified	0

Show:10 25 50

← PrevPage 2 of 160Next →

No leaderboard results yet.