SOTAVerified

Math

Papers

Showing 151-175 of 1596 papers

Title | Status | Hype
MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models | Code | 2
Meta-Design Matters: A Self-Design Multi-Agent System | Code | 2
Meta Prompting for AI Systems | Code | 2
MM-Vet: Evaluating Large Multimodal Models for Integrated Capabilities | Code | 2
Measuring Mathematical Problem Solving With the MATH Dataset | Code | 2
MathVista: Evaluating Mathematical Reasoning of Foundation Models in Visual Contexts | Code | 2
Measuring Multimodal Mathematical Reasoning with MATH-Vision Dataset | Code | 2
MathCoder2: Better Math Reasoning from Continued Pretraining on Model-translated Mathematical Code | Code | 2
ChatGLM-Math: Improving Math Problem-Solving in Large Language Models with a Self-Critique Pipeline | Code | 2
MathCoder: Seamless Code Integration in LLMs for Enhanced Mathematical Reasoning | Code | 2
Enhancing Multi-Step Reasoning Abilities of Language Models through Direct Q-Function Optimization | Code | 2
MathBench: Evaluating the Theory and Application Proficiency of LLMs with a Hierarchical Mathematics Benchmark | Code | 2
Math-LLaVA: Bootstrapping Mathematical Reasoning for Multimodal Large Language Models | Code | 2
MathOdyssey: Benchmarking Mathematical Problem-Solving Skills in Large Language Models Using Odyssey Math Data | Code | 2
MegaMath: Pushing the Limits of Open Math Corpora | Code | 2
Offline Reinforcement Learning for LLM Multi-Step Reasoning | Code | 2
Enhancing Reasoning Capabilities of LLMs via Principled Synthetic Logic Corpus | Code | 2
Memorizing Transformers | Code | 2
Agent Lumos: Unified and Modular Training for Open-Source Language Agents | Code | 2
Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision | Code | 2
MACM: Utilizing a Multi-Agent System for Condition Mining in Solving Complex Mathematical Problems | Code | 2
Essential-Web v1.0: 24T tokens of organized web data | Code | 2
Adaptable Logical Control for Large Language Models | Code | 2
LoRA-XS: Low-Rank Adaptation with Extremely Small Number of Parameters | Code | 2
MAmmoTH: Building Math Generalist Models through Hybrid Instruction Tuning | Code | 2
Page 7 of 64

No leaderboard results yet.