| Sequence to General Tree: Knowledge-Guided Geometry Word Problem Solving | Jun 2, 2021 | Math | CodeCode Available | 0 |
| Teaching-Inspired Integrated Prompting Framework: A Novel Approach for Enhancing Reasoning in Large Language Models | Oct 10, 2024 | Arithmetic ReasoningMath | CodeCode Available | 0 |
| GThinker: Towards General Multimodal Reasoning via Cue-Guided Rethinking | Jun 1, 2025 | 4kMath | CodeCode Available | 0 |
| Guided Speculative Inference for Efficient Test-Time Alignment of LLMs | Jun 4, 2025 | Math | CodeCode Available | 0 |
| Can Vision-Language Models Evaluate Handwritten Math? | Jan 13, 2025 | Math | CodeCode Available | 0 |
| Adversarial Examples for Evaluating Math Word Problem Solvers | Sep 13, 2021 | Adversarial RobustnessMath | CodeCode Available | 0 |
| Guiding Through Complexity: What Makes Good Supervision for Hard Reasoning Tasks? | Oct 27, 2024 | Data AugmentationMath | CodeCode Available | 0 |
| Effective Skill Unlearning through Intervention and Abstention | Mar 27, 2025 | General KnowledgeMath | CodeCode Available | 0 |
| Not All Votes Count! Programs as Verifiers Improve Self-Consistency of Language Models for Math Reasoning | Oct 16, 2024 | AllGSM8K | CodeCode Available | 0 |
| HARDMath2: A Benchmark for Applied Mathematics Built by Students as Part of a Graduate Class | May 17, 2025 | MathMathematical Problem-Solving | CodeCode Available | 0 |