| Is Your Model Really A Good Math Reasoner? Evaluating Mathematical Reasoning with Checklist | Jul 11, 2024 | GSM8K, Math | —Unverified | 0 |
| Iterative Reasoning Preference Optimization | Apr 30, 2024 | ARC, GSM8K | —Unverified | 0 |
| Yi-Lightning Technical Report | Dec 2, 2024 | Chatbot, Large Language Model | —Unverified | 0 |
| Adaptive Guidance Accelerates Reinforcement Learning of Reasoning Models | Jun 16, 2025 | Math, reinforcement-learning | —Unverified | 0 |
| JMMMU: A Japanese Massive Multi-discipline Multimodal Understanding Benchmark for Culture-aware Evaluation | Oct 22, 2024 | Math | —Unverified | 0 |
| Beyond Captioning: Task-Specific Prompting for Improved VLM Performance in Mathematical Reasoning | Oct 8, 2024 | Image Retrieval, Math | —Unverified | 0 |
| Think Twice: Enhancing LLM Reasoning by Scaling Multi-round Test-time Thinking | Mar 25, 2025 | Math, Reinforcement Learning (RL) | —Unverified | 0 |
| Kappa Learning: A New Method for Measuring Similarity Between Educational Items Using Performance Data | Dec 20, 2018 | Clustering, Math | —Unverified | 0 |
| Key-Point-Driven Data Synthesis with its Enhancement on Mathematical Reasoning | Mar 4, 2024 | GSM8K, Math | —Unverified | 0 |
| Thought-Augmented Policy Optimization: Bridging External Guidance and Internal Capabilities | May 21, 2025 | Math, Reinforcement Learning (RL) | —Unverified | 0 |