| Syzygy of Thoughts: Improving LLM CoT with the Minimal Free Resolution | Apr 13, 2025 | GSM8K, Math | Code Available | 3 |
| VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning | Apr 10, 2025 | Math, Multimodal Reasoning | Code Available | 2 |
| Task-Circuit Quantization: Leveraging Knowledge Localization and Interpretability for Compression | Apr 10, 2025 | Math, MMLU | Code Available | 1 |
| Dynamic Cheatsheet: Test-Time Learning with Adaptive Memory | Apr 10, 2025 | Math, MMLU | Code Available | 3 |
| Supervised Optimism Correction: Be Confident When LLMs Are Sure | Apr 10, 2025 | GSM8K, Math | Unverified | 0 |
| GPT Carry-On: Training Foundation Model for Customization Could Be Simple, Scalable and Affordable | Apr 10, 2025 | GPU, Math | Unverified | 0 |
| MDIT: A Model-free Data Interpolation Method for Diverse Instruction Tuning | Apr 9, 2025 | Code Generation, Diversity | Unverified | 0 |
| Right Question is Already Half the Answer: Fully Unsupervised LLM Reasoning Incentivization | Apr 8, 2025 | Math, Mathematical Reasoning | Code Available | 2 |
| MDK12-Bench: A Multi-Discipline Benchmark for Evaluating Reasoning in Multimodal Large Language Models | Apr 8, 2025 | Math, Multimodal Reasoning | Code Available | 1 |
| Beyond Single-Turn: A Survey on Multi-Turn Interactions with Large Language Models | Apr 7, 2025 | Dialogue Evaluation, Fairness | Code Available | 2 |