| MathAttack: Attacking Large Language Models Towards Math Solving Ability | Sep 4, 2023 | Adversarial Attack, GSM8K | Unverified | 0 |
| No Train Still Gain. Unleash Mathematical Reasoning of Large Language Models with Monte Carlo Tree Search Guided by Energy Function | Sep 1, 2023 | GSM8K, Mathematical Reasoning | Unverified | 0 |
| AskIt: Unified Programming Interface for Programming with Large Language Models | Aug 29, 2023 | Code Generation, Few-Shot Learning | Code Available | 1 |
| Exploring Equation as a Better Intermediate Meaning Representation for Numerical Reasoning | Aug 21, 2023 | GSM8K | Code Available | 0 |
| WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct | Aug 18, 2023 | Arithmetic Reasoning, GSM8K | Code Available | 5 |
| Scaling Relationship on Learning Mathematical Reasoning with Large Language Models | Aug 3, 2023 | Arithmetic Reasoning, GSM8K | Code Available | 2 |
| SelfCheck: Using LLMs to Zero-Shot Check Their Own Step-by-Step Reasoning | Aug 1, 2023 | GSM8K, Math | Code Available | 1 |
| A mixed policy to improve performance of language models on math problems | Jul 17, 2023 | GSM8K, Math | Code Available | 0 |
| DiversiGATE: A Comprehensive Framework for Reliable Large Language Models | Jun 22, 2023 | Arithmetic Reasoning, GSM8K | Unverified | 0 |
| Interpretable Math Word Problem Solution Generation Via Step-by-step Planning | Jun 1, 2023 | GSM8K, Language Modeling | Unverified | 0 |
| Matrix Information Theory for Self-Supervised Learning | May 27, 2023 | Contrastive Learning, GSM8K | Code Available | 1 |
| Beyond Chain-of-Thought, Effective Graph-of-Thought Reasoning in Language Models | May 26, 2023 | GSM8K, Multimodal Reasoning | Code Available | 3 |
| GRACE: Discriminator-Guided Chain-of-Thought Reasoning | May 24, 2023 | GSM8K, Math | Code Available | 1 |
| Calc-X and Calcformers: Empowering Arithmetical Chain-of-Thought through Interaction with Symbolic Systems | May 24, 2023 | Arithmetic Reasoning, GSM8K | Code Available | 0 |
| Self-Polish: Enhance Reasoning in Large Language Models via Problem Refinement | May 23, 2023 | GSM8K | Code Available | 1 |
| PaD: Program-aided Distillation Can Teach Small Models Reasoning Better than Chain-of-thought Fine-tuning | May 23, 2023 | Arithmetic Reasoning, GSM8K | Code Available | 0 |
| Automatic Model Selection with Large Language Models for Reasoning | May 23, 2023 | Arithmetic Reasoning, GSM8K | Code Available | 1 |
| RCOT: Detecting and Rectifying Factual Inconsistency in Reasoning by Reversing Chain-of-Thought | May 19, 2023 | Arithmetic Reasoning, GSM8K | Unverified | 0 |
| Hint of Thought prompting: an explainable and zero-shot approach to reasoning tasks with LLMs | May 19, 2023 | Arithmetic Reasoning, GSM8K | Unverified | 0 |
| Self-Evaluation Guided Beam Search for Reasoning | May 1, 2023 | Arithmetic Reasoning, GSM8K | Unverified | 0 |
| Progressive-Hint Prompting Improves Reasoning in Large Language Models | Apr 19, 2023 | Arithmetic Reasoning, GSM8K | Code Available | 2 |
| Solving Math Word Problems by Combining Language Models With Symbolic Solvers | Apr 16, 2023 | GSM8K, Language Modeling | Code Available | 1 |
| Boosted Prompt Ensembles for Large Language Models | Apr 12, 2023 | GSM8K, Language Modeling | Code Available | 1 |
| Large Language Models Are Latent Variable Models: Explaining and Finding Good Demonstrations for In-Context Learning | Jan 27, 2023 | Few-Shot Learning, GSM8K | Code Available | 1 |
| Teaching Small Language Models to Reason | Dec 16, 2022 | GSM8K, Knowledge Distillation | Unverified | 0 |