| Revisiting Overthinking in Long Chain-of-Thought from the Perspective of Self-Doubt | May 29, 2025 | Mathematical Reasoning | Unverified | 0 |
| MathArena: Evaluating LLMs on Uncontaminated Math Competitions | May 29, 2025 | Math, Mathematical Reasoning | Code Available | 3 |
| Probability-Consistent Preference Optimization for Enhanced LLM Reasoning | May 29, 2025 | Mathematical Reasoning | Code Available | 0 |
| Revisiting Multi-Agent Debate as Test-Time Scaling: A Systematic Study of Conditional Effectiveness | May 29, 2025 | Diversity, Large Language Model | Unverified | 0 |
| Diversity-Aware Policy Optimization for Large Language Model Reasoning | May 29, 2025 | Diversity, Language Modeling | Unverified | 0 |
| DeepTheorem: Advancing LLM Reasoning for Theorem Proving Through Natural Language and Reinforcement Learning | May 29, 2025 | Automated Theorem Proving, Mathematical Reasoning | Code Available | 1 |
| Let's Reason Formally: Natural-Formal Hybrid Reasoning Enhances LLM's Math Capability | May 29, 2025 | Math, Mathematical Reasoning | Unverified | 0 |
| On-Policy RL with Optimal Reward Baseline | May 29, 2025 | Large Language Model, Mathematical Reasoning | Unverified | 0 |
| AutoGPS: Automated Geometry Problem Solving via Multimodal Formalization and Deductive Reasoning | May 29, 2025 | Geometry Problem Solving, Mathematical Reasoning | Unverified | 0 |
| Decomposing Elements of Problem Solving: What "Math" Does RL Teach? | May 28, 2025 | Math, Mathematical Problem-Solving | Code Available | 0 |