| Learning to Check: Unleashing Potentials for Self-Correction in Large Language Models | Feb 20, 2024 | Mathematical Reasoning | Code Available | 1 |
| Reformatted Alignment | Feb 19, 2024 | GSM8K, Hallucination | Code Available | 2 |
| Learning From Failure: Integrating Negative Examples when Fine-tuning Large Language Models as Agents | Feb 18, 2024 | Mathematical Reasoning, Multi-hop Question Answering | Code Available | 1 |
| Pride and Prejudice: LLM Amplifies Self-Bias in Self-Refinement | Feb 18, 2024 | Mathematical Reasoning, Text Generation | Code Available | 0 |
| Evaluating LLMs' Mathematical Reasoning in Financial Document Question Answering | Feb 17, 2024 | Arithmetic Reasoning, Mathematical Reasoning | Unverified | 0 |
| When is Tree Search Useful for LLM Planning? It Depends on the Discriminator | Feb 16, 2024 | Mathematical Reasoning, Re-Ranking | Code Available | 2 |
| Reasoning over Uncertain Text by Generative Large Language Models | Feb 14, 2024 | Decision Making, Mathematical Reasoning | Code Available | 0 |
| MUSTARD: Mastering Uniform Synthesis of Theorem and Proof Data | Feb 14, 2024 | Automated Theorem Proving, Language Modelling | Code Available | 1 |
| Fourier Circuits in Neural Networks and Transformers: A Case Study of Modular Arithmetic with Multiple Inputs | Feb 12, 2024 | 2k, Mathematical Reasoning | Unverified | 0 |
| Autonomous Data Selection with Zero-shot Generative Classifiers for Mathematical Texts | Feb 12, 2024 | Continual Pretraining, GSM8K | Code Available | 2 |
| Can Graph Descriptive Order Affect Solving Graph Problems with LLMs? | Feb 11, 2024 | Descriptive, Language Modelling | Unverified | 0 |
| Beyond Lines and Circles: Unveiling the Geometric Reasoning Gap in Large Language Models | Feb 6, 2024 | Mathematical Reasoning, Variable Selection | Unverified | 0 |
| DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models | Feb 5, 2024 | Arithmetic Reasoning, Math | Code Available | 9 |
| Large Language Models for Mathematical Reasoning: Progresses and Challenges | Jan 31, 2024 | Diversity, Math | Unverified | 0 |
| Large Multi-Modal Models (LMMs) as Universal Foundation Models for AI-Native Wireless Systems | Jan 30, 2024 | Mathematical Reasoning, RAG | Unverified | 0 |
| Efficient Tool Use with Chain-of-Abstraction Reasoning | Jan 30, 2024 | Math, Mathematical Reasoning | Unverified | 0 |
| GAPS: Geometry-Aware Problem Solver | Jan 29, 2024 | Geometry Problem Solving, Math | Unverified | 0 |
| EAGLE: Speculative Sampling Requires Rethinking Feature Uncertainty | Jan 26, 2024 | Code Generation, Instruction Following | Code Available | 7 |
| Demystifying Chains, Trees, and Graphs of Thoughts | Jan 25, 2024 | Mathematical Reasoning, Prompt Engineering | Unverified | 0 |
| Distilling Mathematical Reasoning Capabilities into Small Language Models | Jan 22, 2024 | Mathematical Reasoning | Unverified | 0 |
| SuperCLUE-Math6: Graded Multi-Step Math Reasoning Benchmark for LLMs in Chinese | Jan 22, 2024 | Diversity, GSM8K | Code Available | 2 |
| LangBridge: Multilingual Reasoning Without Multilingual Supervision | Jan 19, 2024 | Code Completion, Logical Reasoning | Code Available | 2 |
| Knowledge Fusion of Large Language Models | Jan 19, 2024 | Code Generation, Common Sense Reasoning | Code Available | 4 |
| Evaluating LLMs' Mathematical and Coding Competency through Ontology-guided Interventions | Jan 17, 2024 | Arithmetic Reasoning, Code Generation | Code Available | 1 |
| Augmenting Math Word Problems via Iterative Question Composing | Jan 17, 2024 | Math, Mathematical Reasoning | Code Available | 1 |