| Does Chain-of-Thought Reasoning Help Mobile GUI Agent? An Empirical Study | Mar 21, 2025 | AttributeMathematical Problem-Solving | CodeCode Available | 0 | 5 |
| Decomposing Elements of Problem Solving: What "Math" Does RL Teach? | May 28, 2025 | MathMathematical Problem-Solving | CodeCode Available | 0 | 5 |
| PT-MoE: An Efficient Finetuning Framework for Integrating Mixture-of-Experts into Prompt Tuning | May 14, 2025 | MathMathematical Problem-Solving | CodeCode Available | 0 | 5 |
| Data Contamination Through the Lens of Time | Oct 16, 2023 | Mathematical Problem-Solving | CodeCode Available | 0 | 5 |
| SEGO: Sequential Subgoal Optimization for Mathematical Problem-Solving | Oct 19, 2023 | GSM8KMath | CodeCode Available | 0 | 5 |
| Mathify: Evaluating Large Language Models on Mathematical Problem Solving Tasks | Apr 19, 2024 | Mathematical Problem-Solving | CodeCode Available | 0 | 5 |
| HARDMath2: A Benchmark for Applied Mathematics Built by Students as Part of a Graduate Class | May 17, 2025 | MathMathematical Problem-Solving | CodeCode Available | 0 | 5 |
| GW-MoE: Resolving Uncertainty in MoE Router with Global Workspace Theory | Jun 18, 2024 | Code GenerationMathematical Problem-Solving | CodeCode Available | 0 | 5 |
| Can LLMs Master Math? Investigating Large Language Models on Math Stack Exchange | Mar 30, 2024 | MathMathematical Problem-Solving | CodeCode Available | 0 | 5 |
| A Survey on Mathematical Reasoning and Optimization with Large Language Models | Mar 22, 2025 | Automated Theorem ProvingHeuristic Search | CodeCode Available | 0 | 5 |