| CoinMath: Harnessing the Power of Coding Instruction for Math LLMs | Dec 16, 2024 | Descriptive, Math | Code Available | 0 |
| Polymath: A Challenging Multi-modal Mathematical Reasoning Benchmark | Oct 6, 2024 | Mathematical Reasoning, Spatial Reasoning | Code Available | 0 |
| Planning and Editing What You Retrieve for Enhanced Tool Learning | Mar 30, 2024 | Mathematical Reasoning, Retrieval | Code Available | 0 |
| Pride and Prejudice: LLM Amplifies Self-Bias in Self-Refinement | Feb 18, 2024 | Mathematical Reasoning, Text Generation | Code Available | 0 |
| Code Soliloquies for Accurate Calculations in Large Language Models | Sep 21, 2023 | Language Modelling, Large Language Model | Code Available | 0 |
| Paraphrase and Solve: Exploring and Exploiting the Impact of Surface Form on Mathematical Reasoning in Large Language Models | Apr 17, 2024 | Form, Language Model Evaluation | Code Available | 0 |
| Overcoming Barriers to Skill Injection in Language Modeling: Case Study in Arithmetic | Nov 3, 2022 | Arithmetic Reasoning, Language Modeling | Code Available | 0 |
| Omni-DPO: A Dual-Perspective Paradigm for Dynamic Preference Learning of LLMs | Jun 11, 2025 | Mathematical Reasoning | Code Available | 0 |
| Table Question Answering for Low-resourced Indic Languages | Oct 4, 2024 | Cross-Lingual Transfer, Mathematical Reasoning | Code Available | 0 |
| Evaluating Mathematical Reasoning of Large Language Models: A Focus on Error Identification and Correction | Jun 2, 2024 | Mathematical Reasoning | Code Available | 0 |