SOTAVerified|Agents Browse Leaderboard About

Math

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 401–410 of 1596 papers

Title	Date	Tasks	Status	Hype
Uncovering the Impact of Chain-of-Thought Reasoning for Direct Preference Optimization: Lessons from Text-to-SQL	Feb 17, 2025	Code GenerationMath	CodeCode Available	1
Enhancing Cross-Tokenizer Knowledge Distillation with Contextual Dynamical Mapping	Feb 16, 2025	Code GenerationInstruction Following	CodeCode Available	1
Graders should cheat: privileged information enables expert-level automated evaluations	Feb 16, 2025	Math	—Unverified	0
Dyve: Thinking Fast and Slow for Dynamic Process Verification	Feb 16, 2025	Math	CodeCode Available	1
Don't Get Lost in the Trees: Streamlining LLM Reasoning by Overcoming Tree Search Exploration Pitfalls	Feb 16, 2025	Computational EfficiencyGSM8K	CodeCode Available	0
1bit-Merging: Dynamic Quantized Merging for Large Language Models	Feb 15, 2025	Code GenerationMath	—Unverified	0
MME-CoT: Benchmarking Chain-of-Thought in Large Multimodal Models for Reasoning Quality, Robustness, and Efficiency	Feb 13, 2025	BenchmarkingMath	—Unverified	0
CRANE: Reasoning with constrained LLM generation	Feb 13, 2025	Code GenerationMath	—Unverified	0
Interactive Sketchpad: A Multimodal Tutoring System for Collaborative, Visual Problem-Solving	Feb 12, 2025	Mathmultimodal interaction	—Unverified	0
Mathematical Reasoning in Large Language Models: Assessing Logical and Arithmetic Errors across Wide Numerical Ranges	Feb 12, 2025	GSM8KMath	CodeCode Available	0

Show:10 25 50

← PrevPage 41 of 160Next →

No leaderboard results yet.