SOTAVerified

Math

Papers

Showing 126–150 of 1596 papers

| Title | Status | Hype |
|---|---|---|
| MegaMath: Pushing the Limits of Open Math Corpora | Code | 2 |
| Meta-Design Matters: A Self-Design Multi-Agent System | Code | 2 |
| An Expression Tree Decoding Strategy for Mathematical Equation Generation | Code | 2 |
| Autonomous Data Selection with Zero-shot Generative Classifiers for Mathematical Texts | Code | 2 |
| MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models | Code | 2 |
| MINT-CoT: Enabling Interleaved Visual Tokens in Mathematical Chain-of-Thought Reasoning | Code | 2 |
| MathVista: Evaluating Mathematical Reasoning of Foundation Models in Visual Contexts | Code | 2 |
| MathOdyssey: Benchmarking Mathematical Problem-Solving Skills in Large Language Models Using Odyssey Math Data | Code | 2 |
| Measuring Mathematical Problem Solving With the MATH Dataset | Code | 2 |
| Accelerating Sparse Deep Neural Networks | Code | 2 |
| Inference Scaling Laws: An Empirical Analysis of Compute-Optimal Inference for Problem-Solving with Language Models | Code | 2 |
| Advancing Language Model Reasoning through Reinforcement Learning and Inference Scaling | Code | 2 |
| Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision | Code | 2 |
| Math-LLaVA: Bootstrapping Mathematical Reasoning for Multimodal Large Language Models | Code | 2 |
| Measuring Multimodal Mathematical Reasoning with MATH-Vision Dataset | Code | 2 |
| MAS-Zero: Designing Multi-Agent Systems with Zero Supervision | Code | 2 |
| DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving | Code | 2 |
| Agent Lumos: Unified and Modular Training for Open-Source Language Agents | Code | 2 |
| Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate | Code | 2 |
| MACM: Utilizing a Multi-Agent System for Condition Mining in Solving Complex Mathematical Problems | Code | 2 |
| Cumulative Reasoning with Large Language Models | Code | 2 |
| AdaptThink: Reasoning Models Can Learn When to Think | Code | 2 |
| Delta-CoMe: Training-Free Delta-Compression with Mixed-Precision for Large Language Models | Code | 2 |
| MAmmoTH: Building Math Generalist Models through Hybrid Instruction Tuning | Code | 2 |
| MathBench: Evaluating the Theory and Application Proficiency of LLMs with a Hierarchical Mathematics Benchmark | Code | 2 |

No leaderboard results yet.