SOTAVerified

Math

Papers

Showing 126-150 of 1596 papers

| Title | Status | Hype |
| --- | --- | --- |
| MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models | Code | 2 |
| An Expression Tree Decoding Strategy for Mathematical Equation Generation | Code | 2 |
| Efficient Reinforcement Finetuning via Adaptive Curriculum Learning | Code | 2 |
| Meta Prompting for AI Systems | Code | 2 |
| MM-Vet: Evaluating Large Multimodal Models for Integrated Capabilities | Code | 2 |
| MathVista: Evaluating Mathematical Reasoning of Foundation Models in Visual Contexts | Code | 2 |
| Measuring Mathematical Problem Solving With the MATH Dataset | Code | 2 |
| MathOdyssey: Benchmarking Mathematical Problem-Solving Skills in Large Language Models Using Odyssey Math Data | Code | 2 |
| Measuring Multimodal Mathematical Reasoning with MATH-Vision Dataset | Code | 2 |
| Dynamic Early Exit in Reasoning Models | Code | 2 |
| Accelerating Sparse Deep Neural Networks | Code | 2 |
| Inference Scaling Laws: An Empirical Analysis of Compute-Optimal Inference for Problem-Solving with Language Models | Code | 2 |
| Advancing Language Model Reasoning through Reinforcement Learning and Inference Scaling | Code | 2 |
| Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision | Code | 2 |
| Math-LLaVA: Bootstrapping Mathematical Reasoning for Multimodal Large Language Models | Code | 2 |
| MegaMath: Pushing the Limits of Open Math Corpora | Code | 2 |
| MAS-Zero: Designing Multi-Agent Systems with Zero Supervision | Code | 2 |
| MAmmoTH: Building Math Generalist Models through Hybrid Instruction Tuning | Code | 2 |
| Can AI Assistants Know What They Don't Know? | Code | 2 |
| LoRA-XS: Low-Rank Adaptation with Extremely Small Number of Parameters | Code | 2 |
| Agent Lumos: Unified and Modular Training for Open-Source Language Agents | Code | 2 |
| AdaptThink: Reasoning Models Can Learn When to Think | Code | 2 |
| MACM: Utilizing a Multi-Agent System for Condition Mining in Solving Complex Mathematical Problems | Code | 2 |
| MathBench: Evaluating the Theory and Application Proficiency of LLMs with a Hierarchical Mathematics Benchmark | Code | 2 |
| LLaMA-MoE v2: Exploring Sparsity of LLaMA from Perspective of Mixture-of-Experts with Post-Training | Code | 2 |

No leaderboard results yet.