| An Empirical Study of Data Ability Boundary in LLMs' Math Reasoning | Feb 23, 2024 | Arithmetic Reasoning, Automated Theorem Proving | Code Available | 2 |
| Distillation Contrastive Decoding: Improving LLMs Reasoning with Contrastive Decoding and Distillation | Feb 21, 2024 | Arithmetic Reasoning, GSM8K | Code Available | 1 |
| SymBa: Symbolic Backward Chaining for Structured Natural Language Reasoning | Feb 20, 2024 | Arithmetic Reasoning, GSM8K | —Unverified | 0 |
| Evaluating LLMs' Mathematical Reasoning in Financial Document Question Answering | Feb 17, 2024 | Arithmetic Reasoning, Mathematical Reasoning | —Unverified | 0 |
| Orca-Math: Unlocking the potential of SLMs in Grade School Math | Feb 16, 2024 | Arithmetic Reasoning, GSM8K | —Unverified | 0 |
| OpenMathInstruct-1: A 1.8 Million Math Instruction Tuning Dataset | Feb 15, 2024 | Arithmetic Reasoning, GSM8K | Code Available | 4 |
| The Unreasonable Effectiveness of Eccentric Automatic Prompts | Feb 9, 2024 | Arithmetic Reasoning, GSM8K | —Unverified | 0 |
| Exploring Group and Symmetry Principles in Large Language Models | Feb 9, 2024 | Arithmetic Reasoning, Negation | —Unverified | 0 |
| DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models | Feb 5, 2024 | Arithmetic Reasoning, Math | Code Available | 9 |
| Evaluating Gender Bias in Large Language Models via Chain-of-Thought Prompting | Jan 28, 2024 | Arithmetic Reasoning, Fact Checking | —Unverified | 0 |