SOTAVerified

Math

Papers

Showing 201-250 of 1596 papers

| Title | Status | Hype |
| --- | --- | --- |
| Accelerating Chain-of-Thought Reasoning: When Goal-Gradient Importance Meets Dynamic Skipping |  | 0 |
| Multimodal Assessment of Classroom Discourse Quality: A Text-Centered Attention-Based Multi-Task Learning Approach |  | 0 |
| Learning from Peers in Reasoning Models |  | 0 |
| Agent RL Scaling Law: Agent RL with Spontaneous Code Execution for Mathematical Problem Solving | Code | 2 |
| Kalman Filter Enhanced GRPO for Reinforcement Learning-Based Language Model Reasoning | Code | 1 |
| S-GRPO: Early Exit via Reinforcement Learning in Reasoning Models |  | 0 |
| DialogueReason: Rule-Based RL Sparks Dialogue Reasoning in LLMs |  | 0 |
| xGen-small Technical Report |  | 0 |
| Generative Discovery of Partial Differential Equations by Learning from Math Handbooks |  | 0 |
| Scalable LLM Math Reasoning Acceleration with Low-rank Distillation |  | 0 |
| Putting the Value Back in RL: Better Test-Time Scaling by Unifying LLM Reasoners With Verifiers |  | 0 |
| SIMPLEMIX: Frustratingly Simple Mixing of Off- and On-policy Data in Language Model Preference Learning |  | 0 |
| RM-R1: Reward Modeling as Reasoning | Code | 2 |
| Generating Narrated Lecture Videos from Slides with Synchronized Highlights |  | 0 |
| Rewriting Pre-Training Data Boosts LLM Performance in Math and Code | Code | 1 |
| A Survey of Slow Thinking-based Reasoning LLMs using Reinforced Learning and Inference-time Scaling Law |  | 0 |
| LookAlike: Consistent Distractor Generation in Math MCQs |  | 0 |
| TutorGym: A Testbed for Evaluating AI Agents as Tutors and Students | Code | 0 |
| NeMo-Inspector: A Visualization Tool for LLM Generation Analysis | Code | 1 |
| DeepCritic: Deliberate Critique with Large Language Models | Code | 1 |
| LLMs Do Not Have Human-Like Working Memory |  | 0 |
| Phi-4-reasoning Technical Report |  | 0 |
| Phi-4-Mini-Reasoning: Exploring the Limits of Small Reasoning Language Models in Math |  | 0 |
| AdaptMI: Adaptive Skill-based In-context Math Instruction for Small Language Models |  | 0 |
| Reinforcement Learning for Reasoning in Large Language Models with One Training Example | Code | 3 |
| Trace-of-Thought Prompting: Investigating Prompt-Based Knowledge Distillation Through Question Decomposition |  | 0 |
| Local Prompt Optimization |  | 0 |
| Accurate and Diverse LLM Mathematical Reasoning via Automated PRM-Guided GFlowNets |  | 0 |
| APE-Bench I: Towards File-level Automated Proof Engineering of Formal Math Libraries |  | 0 |
| Efficient Reasoning for LLMs through Speculative Chain-of-Thought | Code | 1 |
| Evaluating Grounded Reasoning by Code-Assisted Large Language Models for Mathematics |  | 0 |
| An Empirical Study on Prompt Compression for Large Language Models | Code | 3 |
| Benchmarking Multimodal Mathematical Reasoning with Explicit Visual Dependency | Code | 1 |
| Training Large Language Models to Reason via EM Policy Gradient |  | 0 |
| SplitReason: Learning To Offload Reasoning |  | 0 |
| Process Reward Models That Think | Code | 2 |
| AIMO-2 Winning Solution: Building State-of-the-Art Mathematical Reasoning Models with OpenMathReasoning dataset | Code | 4 |
| DianJin-R1: Evaluating and Enhancing Financial Reasoning in Large Language Models | Code | 0 |
| Dynamic Early Exit in Reasoning Models | Code | 2 |
| TTRL: Test-Time Reinforcement Learning | Code | 7 |
| LongPerceptualThoughts: Distilling System-2 Reasoning for System-1 Perception |  | 0 |
| Evaluating Judges as Evaluators: The JETTS Benchmark of LLM-as-Judges as Test-Time Scaling Evaluators | Code | 0 |
| OTC: Optimal Tool Calls via Reinforcement Learning |  | 0 |
| Learning to Reason under Off-Policy Guidance | Code | 3 |
| Roll the dice & look before you leap: Going beyond the creative limits of next-token prediction | Code | 2 |
| Stop Summation: Min-Form Credit Assignment Is All Process Reward Model Needs for Reasoning | Code | 2 |
| Enhancing Math Learning in an LMS Using AI-Driven Question Recommendations |  | 0 |
| Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model? |  | 0 |
| THOUGHTTERMINATOR: Benchmarking, Calibrating, and Mitigating Overthinking in Reasoning Models |  | 0 |
| MathPhys-Guided Coarse-to-Fine Anomaly Synthesis with SQE-Driven Bi-Level Optimization for Anomaly Detection |  | 0 |
Page 5 of 32

No leaderboard results yet.