SOTAVerified

Math

Papers

Showing 251-275 of 1596 papers

| Title | Status | Hype |
| --- | --- | --- |
| In between myth and reality: AI for math -- a case study in category theory | | 0 |
| Rethinking the Generation of High-Quality CoT Data from the Perspective of LLM-Adaptive Question Difficulty Grading | | 0 |
| Entropy-Guided Watermarking for LLMs: A Test-Time Framework for Robust and Traceable Text Generation | | 0 |
| Reinforcement Learning from Human Feedback | Code | 5 |
| ReTool: Reinforcement Learning for Strategic Tool Use in LLMs | | 0 |
| Fine-Tuning Large Language Models on Quantum Optimization Problems for Circuit Generation | Code | 1 |
| M1: Towards Scalable Test-Time Compute with Mamba Reasoning Models | Code | 1 |
| Heimdall: test-time scaling on the generative verification | | 0 |
| Efficient Process Reward Model Training via Active Learning | Code | 1 |
| The Jailbreak Tax: How Useful are Your Jailbreak Outputs? | Code | 1 |
| Syzygy of Thoughts: Improving LLM CoT with the Minimal Free Resolution | Code | 3 |
| VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning | Code | 2 |
| Dynamic Cheatsheet: Test-Time Learning with Adaptive Memory | Code | 3 |
| Task-Circuit Quantization: Leveraging Knowledge Localization and Interpretability for Compression | Code | 1 |
| Supervised Optimism Correction: Be Confident When LLMs Are Sure | | 0 |
| GPT Carry-On: Training Foundation Model for Customization Could Be Simple, Scalable and Affordable | | 0 |
| MDIT: A Model-free Data Interpolation Method for Diverse Instruction Tuning | | 0 |
| Right Question is Already Half the Answer: Fully Unsupervised LLM Reasoning Incentivization | Code | 2 |
| MDK12-Bench: A Multi-Discipline Benchmark for Evaluating Reasoning in Multimodal Large Language Models | Code | 1 |
| Beyond Single-Turn: A Survey on Multi-Turn Interactions with Large Language Models | Code | 2 |
| Synthetic Data Generation & Multi-Step RL for Reasoning & Tool Use | | 0 |
| Quantization Hurts Reasoning? An Empirical Study on Quantized Reasoning Models | Code | 2 |
| Reasoning Models Know When They're Right: Probing Hidden States for Self-Verification | | 0 |
| Efficient Reinforcement Finetuning via Adaptive Curriculum Learning | Code | 2 |
| Retro-Search: Exploring Untaken Paths for Deeper and Efficient Reasoning | | 0 |
Page 11 of 64

No leaderboard results yet.