SOTAVerified

Math

Papers

Showing 176-200 of 1596 papers

Title | Status | Hype
Essential-Web v1.0: 24T tokens of organized web data | Code | 2
Adaptable Logical Control for Large Language Models | Code | 2
Evaluating Mathematical Reasoning Beyond Accuracy | Code | 2
Measuring Mathematical Problem Solving With the MATH Dataset | Code | 2
MathCoder: Seamless Code Integration in LLMs for Enhanced Mathematical Reasoning | Code | 2
MathCoder2: Better Math Reasoning from Continued Pretraining on Model-translated Mathematical Code | Code | 2
Can AI Assistants Know What They Don't Know? | Code | 2
Math-LLaVA: Bootstrapping Mathematical Reasoning for Multimodal Large Language Models | Code | 2
Measuring Multimodal Mathematical Reasoning with MATH-Vision Dataset | Code | 2
MathPile: A Billion-Token-Scale Pretraining Corpus for Math | Code | 2
Enhancing Multi-Step Reasoning Abilities of Language Models through Direct Q-Function Optimization | Code | 2
MAS-Zero: Designing Multi-Agent Systems with Zero Supervision | Code | 2
A Survey of Deep Learning for Mathematical Reasoning | Code | 2
ProcessBench: Identifying Process Errors in Mathematical Reasoning | Code | 2
Enhancing Reasoning Capabilities of LLMs via Principled Synthetic Logic Corpus | Code | 2
Agent Lumos: Unified and Modular Training for Open-Source Language Agents | Code | 2
MACM: Utilizing a Multi-Agent System for Condition Mining in Solving Complex Mathematical Problems | Code | 2
Inference Scaling Laws: An Empirical Analysis of Compute-Optimal Inference for Problem-Solving with Language Models | Code | 2
Balancing LoRA Performance and Efficiency with Simple Shard Sharing | Code | 2
LoRA-XS: Low-Rank Adaptation with Extremely Small Number of Parameters | Code | 2
Cumulative Reasoning with Large Language Models | Code | 2
MAmmoTH: Building Math Generalist Models through Hybrid Instruction Tuning | Code | 2
MathBench: Evaluating the Theory and Application Proficiency of LLMs with a Hierarchical Mathematics Benchmark | Code | 2
MegaMath: Pushing the Limits of Open Math Corpora | Code | 2
LLaMA-MoE v2: Exploring Sparsity of LLaMA from Perspective of Mixture-of-Experts with Post-Training | Code | 2

No leaderboard results yet.