SOTAVerified

Math

Papers

Showing 151–175 of 1596 papers

| Title | Status | Hype |
|---|---|---|
| Memorizing Transformers | Code | 2 |
| Meteor: Mamba-based Traversal of Rationale for Large Language and Vision Models | Code | 2 |
| MINT-CoT: Enabling Interleaved Visual Tokens in Mathematical Chain-of-Thought Reasoning | Code | 2 |
| MathVista: Evaluating Mathematical Reasoning of Foundation Models in Visual Contexts | Code | 2 |
| Math-LLaVA: Bootstrapping Mathematical Reasoning for Multimodal Large Language Models | Code | 2 |
| Delta-CoMe: Training-Free Delta-Compression with Mixed-Precision for Large Language Models | Code | 2 |
| MathCoder2: Better Math Reasoning from Continued Pretraining on Model-translated Mathematical Code | Code | 2 |
| Enhancing Multi-Step Reasoning Abilities of Language Models through Direct Q-Function Optimization | Code | 2 |
| MathCoder: Seamless Code Integration in LLMs for Enhanced Mathematical Reasoning | Code | 2 |
| Measuring Mathematical Problem Solving With the MATH Dataset | Code | 2 |
| Omni-MATH: A Universal Olympiad Level Mathematic Benchmark For Large Language Models | Code | 2 |
| MathBench: Evaluating the Theory and Application Proficiency of LLMs with a Hierarchical Mathematics Benchmark | Code | 2 |
| Cumulative Reasoning with Large Language Models | Code | 2 |
| Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate | Code | 2 |
| DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving | Code | 2 |
| MathOdyssey: Benchmarking Mathematical Problem-Solving Skills in Large Language Models Using Odyssey Math Data | Code | 2 |
| MegaMath: Pushing the Limits of Open Math Corpora | Code | 2 |
| Meta-Design Matters: A Self-Design Multi-Agent System | Code | 2 |
| MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models | Code | 2 |
| Agent Lumos: Unified and Modular Training for Open-Source Language Agents | Code | 2 |
| Adaptable Logical Control for Large Language Models | Code | 2 |
| MACM: Utilizing a Multi-Agent System for Condition Mining in Solving Complex Mathematical Problems | Code | 2 |
| MAmmoTH: Building Math Generalist Models through Hybrid Instruction Tuning | Code | 2 |
| CorDA: Context-Oriented Decomposition Adaptation of Large Language Models for Task-Aware Parameter-Efficient Fine-tuning | Code | 2 |
Page 7 of 64

No leaderboard results yet.