SOTAVerified

Math

Papers

Showing 671–680 of 1596 papers

Title	Date	Tasks	Status	Hype
Evaluating Judges as Evaluators: The JETTS Benchmark of LLM-as-Judges as Test-Time Scaling Evaluators	Apr 21, 2025	Code Generation, Instruction Following	Code Available	0
LongPerceptualThoughts: Distilling System-2 Reasoning for System-1 Perception	Apr 21, 2025	Math, MMLU	Unverified	0
OTC: Optimal Tool Calls via Reinforcement Learning	Apr 21, 2025	Math, reinforcement-learning	Unverified	0
Enhancing Math Learning in an LMS Using AI-Driven Question Recommendations	Apr 18, 2025	Management, Math	Unverified	0
Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model?	Apr 18, 2025	Math, Visual Reasoning	Unverified	0
In between myth and reality: AI for math -- a case study in category theory	Apr 17, 2025	Math	Unverified	0
THOUGHTTERMINATOR: Benchmarking, Calibrating, and Mitigating Overthinking in Reasoning Models	Apr 17, 2025	Benchmarking, Math	Unverified	0
MathPhys-Guided Coarse-to-Fine Anomaly Synthesis with SQE-Driven Bi-Level Optimization for Anomaly Detection	Apr 17, 2025	Anomaly Detection, Data Augmentation	Unverified	0
Entropy-Guided Watermarking for LLMs: A Test-Time Framework for Robust and Traceable Text Generation	Apr 16, 2025	GSM8K, Math	Unverified	0
Rethinking the Generation of High-Quality CoT Data from the Perspective of LLM-Adaptive Question Difficulty Grading	Apr 16, 2025	2k, Code Generation	Unverified	0

No leaderboard results yet.