| Agent RL Scaling Law: Agent RL with Spontaneous Code Execution for Mathematical Problem Solving | May 12, 2025 | Math, Mathematical Problem-Solving | Code Available | 2 |
| Reasoning Models Can Be Effective Without Thinking | Apr 14, 2025 | Automated Theorem Proving, Mathematical Problem-Solving | Unverified | 0 |
| Holistic Capability Preservation: Towards Compact Yet Comprehensive Reasoning Models | Apr 9, 2025 | Instruction Following, Mathematical Problem-Solving | Unverified | 0 |
| LearNAT: Learning NL2SQL with AST-guided Task Decomposition for Large Language Models | Apr 3, 2025 | Mathematical Problem-Solving, Prompt Engineering | Unverified | 0 |
| On Vanishing Variance in Transformer Length Generalization | Apr 3, 2025 | Attribute, Mathematical Problem-Solving | Unverified | 0 |
| Exploring LLM Reasoning Through Controlled Prompt Variations | Apr 2, 2025 | GSM8K, Mathematical Problem-Solving | Code Available | 0 |
| Brains vs. Bytes: Evaluating LLM Proficiency in Olympiad Mathematics | Apr 1, 2025 | Math, Mathematical Problem-Solving | Unverified | 0 |
| Entropy-Based Adaptive Weighting for Self-Training | Mar 31, 2025 | GSM8K, Math | Code Available | 1 |
| MathAgent: Leveraging a Mixture-of-Math-Agent Framework for Real-World Multimodal Mathematical Error Detection | Mar 23, 2025 | Math, Mathematical Problem-Solving | Unverified | 0 |
| A Survey on Mathematical Reasoning and Optimization with Large Language Models | Mar 22, 2025 | Automated Theorem Proving, Heuristic Search | Code Available | 0 |