SOTAVerified

Math

Papers

Showing 151-200 of 1596 papers

Title | Status | Hype
AdaptThink: Reasoning Models Can Learn When to Think | Code | 2
Omni-MATH: A Universal Olympiad Level Mathematic Benchmark For Large Language Models | Code | 2
Quantization Hurts Reasoning? An Empirical Study on Quantized Reasoning Models | Code | 2
Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision | Code | 2
Autoformalizing Euclidean Geometry | Code | 2
MM-Vet: Evaluating Large Multimodal Models for Integrated Capabilities | Code | 2
Multi-View Reasoning: Consistent Contrastive Learning for Math Word Problem | Code | 2
OctoThinker: Mid-training Incentivizes Reinforcement Learning Scaling | Code | 2
Meta Prompting for AI Systems | Code | 2
Meta-Design Matters: A Self-Design Multi-Agent System | Code | 2
Big-Math: A Large-Scale, High-Quality Math Dataset for Reinforcement Learning in Language Models | Code | 2
MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models | Code | 2
MINT-CoT: Enabling Interleaved Visual Tokens in Mathematical Chain-of-Thought Reasoning | Code | 2
Measuring Multimodal Mathematical Reasoning with MATH-Vision Dataset | Code | 2
FastCuRL: Curriculum Reinforcement Learning with Progressive Context Extension for Efficient Training R1-like Reasoning Models | Code | 2
Meteor: Mamba-based Traversal of Rationale for Large Language and Vision Models | Code | 2
Dynamic Early Exit in Reasoning Models | Code | 2
MegaMath: Pushing the Limits of Open Math Corpora | Code | 2
Memorizing Transformers | Code | 2
Offline Reinforcement Learning for LLM Multi-Step Reasoning | Code | 2
MuggleMath: Assessing the Impact of Query and Response Augmentation on Math Reasoning | Code | 2
Math-LLaVA: Bootstrapping Mathematical Reasoning for Multimodal Large Language Models | Code | 2
MathCoder: Seamless Code Integration in LLMs for Enhanced Mathematical Reasoning | Code | 2
Nexus: A Lightweight and Scalable Multi-Agent Framework for Complex Tasks Automation | Code | 2
Adaptable Logical Control for Large Language Models | Code | 2
MathOdyssey: Benchmarking Mathematical Problem-Solving Skills in Large Language Models Using Odyssey Math Data | Code | 2
DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving | Code | 2
Autonomous Data Selection with Zero-shot Generative Classifiers for Mathematical Texts | Code | 2
Cumulative Reasoning with Large Language Models | Code | 2
MathBench: Evaluating the Theory and Application Proficiency of LLMs with a Hierarchical Mathematics Benchmark | Code | 2
Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate | Code | 2
Enhancing Multi-Step Reasoning Abilities of Language Models through Direct Q-Function Optimization | Code | 2
Delta-CoMe: Training-Free Delta-Compression with Mixed-Precision for Large Language Models | Code | 2
Physics of Language Models: Part 2.1, Grade-School Math and the Hidden Reasoning Process | Code | 2
Essential-Web v1.0: 24T tokens of organized web data | Code | 2
Play to Generalize: Learning to Reason Through Game Play | Code | 2
MathCoder2: Better Math Reasoning from Continued Pretraining on Model-translated Mathematical Code | Code | 2
MathVista: Evaluating Mathematical Reasoning of Foundation Models in Visual Contexts | Code | 2
Exploring the Compositional Deficiency of Large Language Models in Mathematical Reasoning | Code | 2
CPPO: Accelerating the Training of Group Relative Policy Optimization-Based Reasoning Models | Code | 2
CorDA: Context-Oriented Decomposition Adaptation of Large Language Models for Task-Aware Parameter-Efficient Fine-tuning | Code | 2
A Survey of Deep Learning for Mathematical Reasoning | Code | 2
MAmmoTH: Building Math Generalist Models through Hybrid Instruction Tuning | Code | 2
MAS-Zero: Designing Multi-Agent Systems with Zero Supervision | Code | 2
LoRA-XS: Low-Rank Adaptation with Extremely Small Number of Parameters | Code | 2
Agent Lumos: Unified and Modular Training for Open-Source Language Agents | Code | 2
Flaming-hot Initiation with Regular Execution Sampling for Large Language Models | Code | 2
Confucius3-Math: A Lightweight High-Performance Reasoning LLM for Chinese K-12 Mathematics Learning | Code | 2
Beyond Single-Turn: A Survey on Multi-Turn Interactions with Large Language Models | Code | 2
MACM: Utilizing a Multi-Agent System for Condition Mining in Solving Complex Mathematical Problems | Code | 2
Page 4 of 32

No leaderboard results yet.