| Evaluation of LLMs for mathematical problem solving | May 30, 2025 | GSM8K, Mathematical Problem-Solving | Unverified | 0 |
| RMoA: Optimizing Mixture-of-Agents through Diversity Maximization and Residual Compensation | May 30, 2025 | Code Generation, Diversity | Code Available | 0 |
| Scaling up the think-aloud method | May 29, 2025 | Mathematical Reasoning | Code Available | 0 |
| Let's Reason Formally: Natural-Formal Hybrid Reasoning Enhances LLM's Math Capability | May 29, 2025 | Math, Mathematical Reasoning | Unverified | 0 |
| On-Policy RL with Optimal Reward Baseline | May 29, 2025 | Large Language Model, Mathematical Reasoning | Unverified | 0 |
| Probability-Consistent Preference Optimization for Enhanced LLM Reasoning | May 29, 2025 | Mathematical Reasoning | Code Available | 0 |
| Revisiting Multi-Agent Debate as Test-Time Scaling: A Systematic Study of Conditional Effectiveness | May 29, 2025 | Diversity, Large Language Model | Unverified | 0 |
| Diversity-Aware Policy Optimization for Large Language Model Reasoning | May 29, 2025 | Diversity, Language Modeling | Unverified | 0 |
| Discriminative Policy Optimization for Token-Level Reward Models | May 29, 2025 | GSM8K, Language Modeling | Code Available | 0 |
| Revisiting Overthinking in Long Chain-of-Thought from the Perspective of Self-Doubt | May 29, 2025 | Mathematical Reasoning | Unverified | 0 |