| ProcessBench: Identifying Process Errors in Mathematical Reasoning | Dec 9, 2024 | GSM8KMath | CodeCode Available | 2 |
| TACO: Learning Multi-modal Action Models with Synthetic Chains-of-Thought-and-Action | Dec 7, 2024 | Depth EstimationMathematical Reasoning | CodeCode Available | 2 |
| Neuro-Symbolic Data Generation for Math Reasoning | Dec 6, 2024 | DiversityMath | —Unverified | 0 |
| Evolutionary Pre-Prompt Optimization for Mathematical Reasoning | Dec 5, 2024 | Few-Shot LearningGSM8K | —Unverified | 0 |
| Enhancing Mathematical Reasoning in LLMs with Background Operators | Dec 5, 2024 | Data AugmentationMath | —Unverified | 0 |
| Training-Free Mitigation of Language Reasoning Degradation After Multimodal Instruction Tuning | Dec 4, 2024 | GSM8KLanguage Modeling | —Unverified | 0 |
| Improving Physics Reasoning in Large Language Models Using Mixture of Refinement Agents | Dec 1, 2024 | Mathematical ReasoningMMLU | —Unverified | 0 |
| Initialization using Update Approximation is a Silver Bullet for Extremely Efficient Low-Rank Fine-Tuning | Nov 29, 2024 | Mathematical Reasoning | CodeCode Available | 2 |
| Critical Tokens Matter: Token-Level Contrastive Estimation Enhances LLM's Reasoning Capability | Nov 29, 2024 | GSM8KMath | CodeCode Available | 1 |
| MATATA: Weakly Supervised End-to-End MAthematical Tool-Augmented Reasoning for Tabular Applications | Nov 28, 2024 | document understandingMathematical Reasoning | —Unverified | 0 |