| CRAFT: Customizing LLMs by Creating and Retrieving from Specialized Toolsets | Sep 29, 2023 | Language Modelling, Mathematical Reasoning | Code Available | 2 |
| ToRA: A Tool-Integrated Reasoning Agent for Mathematical Problem Solving | Sep 29, 2023 | Arithmetic Reasoning, Computational Efficiency | Code Available | 3 |
| LPML: LLM-Prompting Markup Language for Mathematical Reasoning | Sep 21, 2023 | Mathematical Reasoning | Unverified | 0 |
| Code Soliloquies for Accurate Calculations in Large Language Models | Sep 21, 2023 | Language Modelling, Large Language Model | Code Available | 0 |
| MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models | Sep 21, 2023 | Arithmetic Reasoning, GSM8K | Code Available | 2 |
| Auto-Regressive Next-Token Predictors are Universal Learners | Sep 13, 2023 | Mathematical Reasoning, Text Generation | Code Available | 1 |
| MAmmoTH: Building Math Generalist Models through Hybrid Instruction Tuning | Sep 11, 2023 | Math, Mathematical Reasoning | Code Available | 2 |
| On the meaning of uncertainty for ethical AI: philosophy and practice | Sep 11, 2023 | Decision Making, Mathematical Reasoning | Unverified | 0 |
| No Train Still Gain. Unleash Mathematical Reasoning of Large Language Models with Monte Carlo Tree Search Guided by Energy Function | Sep 1, 2023 | GSM8K, Mathematical Reasoning | Unverified | 0 |
| When Do Program-of-Thoughts Work for Reasoning? | Aug 29, 2023 | Code Generation, Mathematical Reasoning | Code Available | 2 |
| Semi-Supervised Learning via Weight-aware Distillation under Class Distribution Mismatch | Aug 23, 2023 | Mathematical Reasoning | Code Available | 1 |
| WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct | Aug 18, 2023 | Arithmetic Reasoning, GSM8K | Code Available | 5 |
| Probabilistic Results on the Architecture of Mathematical Reasoning Aligned by Cognitive Alternation | Aug 17, 2023 | Mathematical Reasoning | Unverified | 0 |
| Separate the Wheat from the Chaff: Model Deficiency Unlearning via Parameter-Efficient Module Operation | Aug 16, 2023 | Language Modeling, Language Modelling | Code Available | 1 |
| Forward-Backward Reasoning in Large Language Models for Mathematical Verification | Aug 15, 2023 | Mathematical Reasoning | Unverified | 0 |
| Solving Challenging Math Word Problems Using GPT-4 Code Interpreter with Code-based Self-Verification | Aug 15, 2023 | Arithmetic Reasoning, Math | Code Available | 2 |
| Scaling Relationship on Learning Mathematical Reasoning with Large Language Models | Aug 3, 2023 | Arithmetic Reasoning, GSM8K | Code Available | 2 |
| Skills-in-Context Prompting: Unlocking Compositionality in Large Language Models | Aug 1, 2023 | In-Context Learning, Math | Unverified | 0 |
| FacTool: Factuality Detection in Generative AI -- A Tool Augmented Framework for Multi-Task and Multi-Domain Scenarios | Jul 25, 2023 | Code Generation, Fact Checking | Code Available | 2 |
| MinT: Boosting Generalization in Mathematical Reasoning via Multi-View Fine-Tuning | Jul 16, 2023 | Knowledge Distillation, Mathematical Reasoning | Unverified | 0 |
| MWPRanker: An Expression Similarity Based Math Word Problem Retriever | Jul 3, 2023 | Logical Sequence, Math | Unverified | 0 |
| Math Word Problem Solving by Generating Linguistic Variants of Problem Statements | Jun 24, 2023 | Decoder, Ingenuity | Code Available | 0 |
| JiuZhang 2.0: A Unified Chinese Pre-trained Language Model for Multi-task Mathematical Problem Solving | Jun 19, 2023 | In-Context Learning, Language Modeling | Unverified | 0 |
| Position: AI Evaluation Should Learn from How We Test Humans | Jun 18, 2023 | Mathematical Reasoning, Position | Code Available | 0 |
| Are Large Language Models Really Good Logical Reasoners? A Comprehensive Evaluation and Beyond | Jun 16, 2023 | Benchmarking, Evidence Selection | Code Available | 1 |
| Investigating the Effectiveness of ChatGPT in Mathematical Reasoning and Problem Solving: Evidence from the Vietnamese National High School Graduation Examination | Jun 10, 2023 | Math, Mathematical Reasoning | Unverified | 0 |
| Turning large language models into cognitive models | Jun 6, 2023 | Decision Making, Mathematical Reasoning | Code Available | 1 |
| Random Feedback Alignment Algorithms to train Neural Networks: Why do they Align? | Jun 4, 2023 | Mathematical Reasoning | Unverified | 0 |
| Evaluating Language Models for Mathematics through Interactions | Jun 2, 2023 | Language Modelling, Mathematical Problem-Solving | Code Available | 1 |
| Learning Multi-Step Reasoning by Solving Arithmetic Tasks | Jun 2, 2023 | Math, Mathematical Reasoning | Code Available | 1 |
| Gorilla: Large Language Model Connected with Massive APIs | May 24, 2023 | Hallucination, Language Modeling | Code Available | 6 |
| A Mechanistic Interpretation of Arithmetic Reasoning in Language Models using Causal Mediation Analysis | May 24, 2023 | Arithmetic Reasoning, Mathematical Reasoning | Code Available | 1 |
| A Symbolic Framework for Evaluating Mathematical Reasoning and Generalisation with Transformers | May 21, 2023 | Mathematical Reasoning | Unverified | 0 |
| FedCBO: Reaching Group Consensus in Clustered Federated Learning through Consensus-based Optimization | May 4, 2023 | Federated Learning, global-optimization | Code Available | 1 |
| Federated Prompting and Chain-of-Thought Reasoning for Improving LLMs Answering | Apr 27, 2023 | Mathematical Reasoning | Unverified | 0 |
| Self-Refine: Iterative Refinement with Self-Feedback | Mar 30, 2023 | Mathematical Reasoning, Response Generation | Code Available | 3 |
| Natural Language Reasoning, A Survey | Mar 26, 2023 | Logical Reasoning, Mathematical Reasoning | Code Available | 1 |
| Sparks of Artificial General Intelligence: Early experiments with GPT-4 | Mar 22, 2023 | Arithmetic Reasoning, Mathematical Reasoning | Code Available | 6 |
| MathPrompter: Mathematical Reasoning using Large Language Models | Mar 4, 2023 | Arithmetic Reasoning, Math | Code Available | 1 |
| A Multi-Modal Neural Geometric Solver with Textual Clauses Parsed from Diagram | Feb 22, 2023 | Geometry Problem Solving, Mathematical Reasoning | Code Available | 1 |
| ChatGPT for Robotics: Design Principles and Model Abilities | Feb 20, 2023 | Mathematical Reasoning, Prompt Engineering | Code Available | 4 |
| Tree-Based Representation and Generation of Natural and Mathematical Language | Feb 15, 2023 | Math, Mathematical Reasoning | Code Available | 1 |
| Learning by Applying: A General Framework for Mathematical Reasoning via Enhancing Explicit Knowledge Learning | Feb 11, 2023 | Decoder, Mathematical Reasoning | Unverified | 0 |
| Explanation Selection Using Unlabeled Data for Chain-of-Thought Prompting | Feb 9, 2023 | Mathematical Reasoning, Natural Language Inference | Code Available | 0 |
| Reliable Natural Language Understanding with Large Language Models and Answer Set Programming | Feb 7, 2023 | Mathematical Reasoning, Natural Language Understanding | Unverified | 0 |
| Techniques to Improve Neural Math Word Problem Solvers | Feb 6, 2023 | Decoder, Language Modelling | Code Available | 0 |
| Mathematical Capabilities of ChatGPT | Jan 31, 2023 | Elementary Mathematics, Math | Code Available | 1 |
| A Survey of Deep Learning for Mathematical Reasoning | Dec 20, 2022 | Deep Learning, Math | Code Available | 2 |
| Reasoning with Language Model Prompting: A Survey | Dec 19, 2022 | Arithmetic Reasoning, Common Sense Reasoning | Code Available | 3 |
| UniGeo: Unifying Geometry Logical Reasoning via Reformulating Mathematical Expression | Dec 6, 2022 | Geometry Problem Solving, Logical Reasoning | Code Available | 1 |