SOTAVerified

Math

Papers

Showing 101–150 of 1596 papers

| Title | Status | Hype |
| --- | --- | --- |
| Discriminative Policy Optimization for Token-Level Reward Models | Code | 0 |
| DINGO: Constrained Inference for Diffusion LLMs | | 0 |
| LLM Performance for Code Generation on Noisy Tasks | Code | 0 |
| Decomposing Elements of Problem Solving: What "Math" Does RL Teach? | Code | 0 |
| ASyMOB: Algebraic Symbolic Mathematical Operations Benchmark | Code | 0 |
| Maximizing Confidence Alone Improves Reasoning | | 0 |
| Skywork Open Reasoner 1 Technical Report | Code | 4 |
| Advancing Multimodal Reasoning via Reinforcement Learning with Cold Start | Code | 1 |
| Unsupervised Post-Training for Multi-Modal LLM Reasoning via GRPO | Code | 2 |
| ChatVLA-2: Vision-Language-Action Model with Open-World Embodied Reasoning from Pretrained Knowledge | Code | 1 |
| Walk Before You Run! Concise LLM Reasoning via Reinforcement Learning | | 0 |
| Reinforcing General Reasoning without Verifiers | Code | 2 |
| REAL-Prover: Retrieval Augmented Lean Prover for Mathematical Reasoning | Code | 1 |
| R2R: Efficiently Navigating Divergent Reasoning Paths with Small-Large Model Token Routing | Code | 2 |
| MAS-Zero: Designing Multi-Agent Systems with Zero Supervision | Code | 2 |
| Which Data Attributes Stimulate Math and Code Reasoning? An Investigation via Influence Functions | | 0 |
| Unifying Multimodal Large Language Model Capabilities and Modalities via Model Merging | Code | 1 |
| Done Is Better than Perfect: Unlocking Efficient Reasoning by Structured Multi-Turn Decomposition | | 0 |
| Hard Negative Contrastive Learning for Fine-Grained Geometric Understanding in Large Multimodal Models | Code | 0 |
| Faster and Better LLMs via Latency-Aware Test-Time Scaling | | 0 |
| The Role of Diversity in In-Context Learning for Large Language Models | | 0 |
| Interleaved Reasoning for Large Language Models via Reinforcement Learning | | 0 |
| Improving Multilingual Math Reasoning for African Languages | | 0 |
| Prismatic Synthesis: Gradient-based Data Diversification Boosts Generalization in LLM Reasoning | | 0 |
| Error Typing for Smarter Rewards: Improving Process Reward Models with Error-Aware Hierarchical Supervision | Code | 0 |
| Enigmata: Scaling Logical Reasoning in Large Language Models with Synthetic Verifiable Puzzles | | 0 |
| Inference-time Alignment in Continuous Space | Code | 0 |
| AI4Math: A Native Spanish Benchmark for University-Level Mathematical Reasoning in Large Language Models | | 0 |
| MMATH: A Multilingual Benchmark for Mathematical Reasoning | Code | 0 |
| Steering LLM Reasoning Through Bias-Only Adaptation | | 0 |
| Enumerate-Conjecture-Prove: Formally Solving Answer-Construction Problems in Math Competitions | Code | 0 |
| Does Representation Intervention Really Identify Desired Concepts and Elicit Alignment? | | 0 |
| MSA at BEA 2025 Shared Task: Disagreement-Aware Instruction Tuning for Multi-Dimensional Evaluation of LLMs as Math Tutors | | 0 |
| On the Effect of Negative Gradient in Group Relative Deep Reinforcement Optimization | | 0 |
| Anchored Diffusion Language Model | | 0 |
| How Is LLM Reasoning Distracted by Irrelevant Context? An Analysis Using a Controlled Benchmark | Code | 0 |
| More Thinking, Less Seeing? Assessing Amplified Hallucination in Multimodal Reasoning Models | | 0 |
| Decoupled Visual Interpretation and Linguistic Reasoning for Math Problem Solving | Code | 1 |
| VideoGameBench: Can Vision-Language Models complete popular video games? | | 0 |
| One RL to See Them All: Visual Triple Unified Reinforcement Learning | | 0 |
| Value-Guided Search for Efficient Chain-of-Thought Reasoning | Code | 1 |
| Towards Revealing the Effectiveness of Small-Scale Fine-tuning in R1-style Reinforcement Learning | Code | 1 |
| Outcome-based Reinforcement Learning to Predict the Future | | 0 |
| The Unreasonable Effectiveness of Model Merging for Cross-Lingual Transfer in LLMs | | 0 |
| RaDeR: Reasoning-aware Dense Retrieval Models | Code | 1 |
| ConciseRL: Conciseness-Guided Reinforcement Learning for Efficient Reasoning Models | Code | 0 |
| AceReason-Nemotron: Advancing Math and Code Reasoning through Reinforcement Learning | | 0 |
| WebAgent-R1: Training Web Agents via End-to-End Multi-Turn Reinforcement Learning | Code | 2 |
| Incremental Sequence Classification with Temporal Consistency | | 0 |
| Veracity Bias and Beyond: Uncovering LLMs' Hidden Beliefs in Problem-Solving Reasoning | | 0 |
Page 3 of 32

No leaderboard results yet.