| Implicit Reasoning in Transformers is Reasoning through Shortcuts | Mar 10, 2025 | Mathematical Reasoning | CodeCode Available | 1 | 5 |
| H-STAR: LLM-driven Hybrid SQL-Text Adaptive Reasoning on Tables | Jun 29, 2024 | Fact VerificationMathematical Reasoning | CodeCode Available | 1 | 5 |
| IconQA: A New Benchmark for Abstract Diagram Understanding and Visual Language Reasoning | Oct 25, 2021 | Arithmetic ReasoningMathematical Question Answering | CodeCode Available | 1 | 5 |
| MetaLadder: Ascending Mathematical Solution Quality via Analogical-Problem Reasoning Transfer | Mar 19, 2025 | Answer GenerationMathematical Reasoning | CodeCode Available | 1 | 5 |
| Assessing the Creativity of LLMs in Proposing Novel Solutions to Mathematical Problems | Oct 24, 2024 | Mathematical Reasoning | CodeCode Available | 1 | 5 |
| HARDMath: A Benchmark Dataset for Challenging Problems in Applied Mathematics | Oct 13, 2024 | Language ModelingLanguage Modelling | CodeCode Available | 1 | 5 |
| GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models | Oct 7, 2024 | GSM8KLogical Reasoning | CodeCode Available | 1 | 5 |
| Math-Shepherd: Verify and Reinforce LLMs Step-by-step without Human Annotations | Dec 14, 2023 | Arithmetic ReasoningGSM8K | CodeCode Available | 1 | 5 |
| GRPO-LEAD: A Difficulty-Aware Reinforcement Learning Approach for Concise Mathematical Reasoning in Language Models | Apr 13, 2025 | Mathematical Reasoning | CodeCode Available | 1 | 5 |
| Harnessing Negative Signals: Reinforcement Distillation from Teacher Data for LLM Reasoning | May 30, 2025 | MathMathematical Reasoning | CodeCode Available | 1 | 5 |