SOTAVerified

Math

Papers

Showing 426–450 of 1596 papers

| Title | Status | Hype |
|---|---|---|
| Upweighting Easy Samples in Fine-Tuning Mitigates Forgetting | Code | 0 |
| Entropy Adaptive Decoding: Dynamic Model Switching for Efficient Inference | | 0 |
| LIMO: Less is More for Reasoning | Code | 5 |
| Do Large Language Model Benchmarks Test Reliability? | Code | 1 |
| SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model | | 0 |
| Premise-Augmented Reasoning Chains Improve Error Identification in Math reasoning with LLMs | | 0 |
| Process Reinforcement through Implicit Rewards | Code | 5 |
| A Probabilistic Inference Approach to Inference-Time Scaling of LLMs using Particle-Based Monte Carlo Methods | Code | 1 |
| Blink of an eye: a simple theory for feature localization in generative models | | 0 |
| Learning Autonomous Code Integration for Math Language Models | | 0 |
| Rethinking Mixture-of-Agents: Is Mixing Different Large Language Models Beneficial? | | 0 |
| UGPhysics: A Comprehensive Benchmark for Undergraduate Physics Reasoning with Large Language Models | Code | 2 |
| Fairshare Data Pricing via Data Valuation for Large Language Models | | 0 |
| BRiTE: Bootstrapping Reinforced Thinking Process to Enhance Language Model Reasoning | | 0 |
| s1: Simple test-time scaling | Code | 9 |
| Pheromone-based Learning of Optimal Reasoning Paths | | 0 |
| Spend Wisely: Maximizing Post-Training Gains in Iterative Synthetic Data Boostrapping | Code | 0 |
| PixelWorld: Towards Perceiving Everything as Pixels | | 0 |
| Examining the Robustness of Large Language Models across Language Complexity | | 0 |
| Efficient Neural Theorem Proving via Fine-grained Proof Structure Analysis | Code | 1 |
| Token-Hungry, Yet Precise: DeepSeek R1 Highlights the Need for Multi-Step Reasoning Over Speed in MATH | | 0 |
| Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate | Code | 2 |
| Token-by-Token Regeneration and Domain Biases: A Benchmark of LLMs on Advanced Mathematical Problem-Solving | | 0 |
| Error Classification of Large Language Models on Math Word Problems: A Dynamically Adaptive Framework | | 0 |
| Clear Preferences Leave Traces: Reference Model-Guided Sampling for Preference Learning | | 0 |

No leaderboard results yet.