SOTAVerified

Math

Papers

Showing 151-175 of 1596 papers

Title | Status | Hype
Memorizing Transformers | Code | 2
MegaMath: Pushing the Limits of Open Math Corpora | Code | 2
Meta-Design Matters: A Self-Design Multi-Agent System | Code | 2
MINT-CoT: Enabling Interleaved Visual Tokens in Mathematical Chain-of-Thought Reasoning | Code | 2
MathVista: Evaluating Mathematical Reasoning of Foundation Models in Visual Contexts | Code | 2
Math-LLaVA: Bootstrapping Mathematical Reasoning for Multimodal Large Language Models | Code | 2
MathBench: Evaluating the Theory and Application Proficiency of LLMs with a Hierarchical Mathematics Benchmark | Code | 2
MathCoder2: Better Math Reasoning from Continued Pretraining on Model-translated Mathematical Code | Code | 2
Delta-CoMe: Training-Free Delta-Compression with Mixed-Precision for Large Language Models | Code | 2
Dynamic Early Exit in Reasoning Models | Code | 2
MathCoder: Seamless Code Integration in LLMs for Enhanced Mathematical Reasoning | Code | 2
Measuring Mathematical Problem Solving With the MATH Dataset | Code | 2
Omni-MATH: A Universal Olympiad Level Mathematic Benchmark For Large Language Models | Code | 2
Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate | Code | 2
Cumulative Reasoning with Large Language Models | Code | 2
MathOdyssey: Benchmarking Mathematical Problem-Solving Skills in Large Language Models Using Odyssey Math Data | Code | 2
MAmmoTH: Building Math Generalist Models through Hybrid Instruction Tuning | Code | 2
Measuring Multimodal Mathematical Reasoning with MATH-Vision Dataset | Code | 2
CPPO: Accelerating the Training of Group Relative Policy Optimization-Based Reasoning Models | Code | 2
Adaptable Logical Control for Large Language Models | Code | 2
DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving | Code | 2
LoRA-XS: Low-Rank Adaptation with Extremely Small Number of Parameters | Code | 2
CorDA: Context-Oriented Decomposition Adaptation of Large Language Models for Task-Aware Parameter-Efficient Fine-tuning | Code | 2
Meta Prompting for AI Systems | Code | 2
Agent Lumos: Unified and Modular Training for Open-Source Language Agents | Code | 2

No leaderboard results yet.