SOTAVerified

Math

Papers

Showing 651-700 of 1596 papers

Title | Status | Hype
Generative Discovery of Partial Differential Equations by Learning from Math Handbooks | | 0
Scalable LLM Math Reasoning Acceleration with Low-rank Distillation | | 0
Putting the Value Back in RL: Better Test-Time Scaling by Unifying LLM Reasoners With Verifiers | | 0
SIMPLEMIX: Frustratingly Simple Mixing of Off- and On-policy Data in Language Model Preference Learning | | 0
A Survey of Slow Thinking-based Reasoning LLMs using Reinforced Learning and Inference-time Scaling Law | | 0
Generating Narrated Lecture Videos from Slides with Synchronized Highlights | | 0
LookAlike: Consistent Distractor Generation in Math MCQs | | 0
TutorGym: A Testbed for Evaluating AI Agents as Tutors and Students | Code | 0
Phi-4-Mini-Reasoning: Exploring the Limits of Small Reasoning Language Models in Math | | 0
AdaptMI: Adaptive Skill-based In-context Math Instruction for Small Language Models | | 0
Phi-4-reasoning Technical Report | | 0
LLMs Do Not Have Human-Like Working Memory | | 0
Trace-of-Thought Prompting: Investigating Prompt-Based Knowledge Distillation Through Question Decomposition | | 0
Local Prompt Optimization | | 0
Accurate and Diverse LLM Mathematical Reasoning via Automated PRM-Guided GFlowNets | | 0
APE-Bench I: Towards File-level Automated Proof Engineering of Formal Math Libraries | | 0
Evaluating Grounded Reasoning by Code-Assisted Large Language Models for Mathematics | | 0
Training Large Language Models to Reason via EM Policy Gradient | | 0
SplitReason: Learning To Offload Reasoning | | 0
DianJin-R1: Evaluating and Enhancing Financial Reasoning in Large Language Models | | 0
Evaluating Judges as Evaluators: The JETTS Benchmark of LLM-as-Judges as Test-Time Scaling Evaluators | Code | 0
LongPerceptualThoughts: Distilling System-2 Reasoning for System-1 Perception | | 0
OTC: Optimal Tool Calls via Reinforcement Learning | | 0
Enhancing Math Learning in an LMS Using AI-Driven Question Recommendations | | 0
Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model? | | 0
MathPhys-Guided Coarse-to-Fine Anomaly Synthesis with SQE-Driven Bi-Level Optimization for Anomaly Detection | | 0
THOUGHTTERMINATOR: Benchmarking, Calibrating, and Mitigating Overthinking in Reasoning Models | | 0
In between myth and reality: AI for math -- a case study in category theory | | 0
Entropy-Guided Watermarking for LLMs: A Test-Time Framework for Robust and Traceable Text Generation | | 0
Rethinking the Generation of High-Quality CoT Data from the Perspective of LLM-Adaptive Question Difficulty Grading | | 0
ReTool: Reinforcement Learning for Strategic Tool Use in LLMs | | 0
Heimdall: test-time scaling on the generative verification | | 0
GPT Carry-On: Training Foundation Model for Customization Could Be Simple, Scalable and Affordable | | 0
Supervised Optimism Correction: Be Confident When LLMs Are Sure | | 0
MDIT: A Model-free Data Interpolation Method for Diverse Instruction Tuning | | 0
Reasoning Models Know When They're Right: Probing Hidden States for Self-Verification | | 0
Synthetic Data Generation & Multi-Step RL for Reasoning & Tool Use | | 0
Retro-Search: Exploring Untaken Paths for Deeper and Efficient Reasoning | | 0
oneDAL Optimization for ARM Scalable Vector Extension: Maximizing Efficiency for High-Performance Data Science | | 0
Online Difficulty Filtering for Reasoning Oriented Reinforcement Learning | | 0
Explain with Visual Keypoints Like a Real Mentor! A Benchmark for Multimodal Solution Explanation | | 0
Cross-Lingual Consistency: A Novel Inference Framework for Advancing Reasoning in Large Language Models | | 0
Brains vs. Bytes: Evaluating LLM Proficiency in Olympiad Mathematics | | 0
How Difficulty-Aware Staged Reinforcement Learning Enhances LLMs' Reasoning Capabilities: A Preliminary Experimental Study | | 0
Hawkeye:Efficient Reasoning with Model Collaboration | | 0
GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning | | 0
Investigating Large Language Models in Diagnosing Students' Cognitive Skills in Math Problem-solving | | 0
An extrapolated and provably convergent algorithm for nonlinear matrix decomposition with the ReLU function | Code | 0
DebFlow: Automating Agent Creation via Agent Debate | | 0
Effective Skill Unlearning through Intervention and Abstention | Code | 0

No leaderboard results yet.