| GW-MoE: Resolving Uncertainty in MoE Router with Global Workspace Theory | Jun 18, 2024 | Code Generation, Mathematical Problem-Solving | Code Available | 0 |
| Exposing the Achilles' Heel: Evaluating LLMs Ability to Handle Mistakes in Mathematical Reasoning | Jun 16, 2024 | Benchmarking, Math | —Unverified | 0 |
| 3D-Properties: Identifying Challenges in DPO and Charting a Path Forward | Jun 11, 2024 | Instruction Following, Mathematical Problem-Solving | —Unverified | 0 |
| OccamLLM: Fast and Exact Language Model Arithmetic in a Single Step | Jun 4, 2024 | Language Modeling, Language Modelling | —Unverified | 0 |
| MathChat: Benchmarking Mathematical Reasoning and Instruction Following in Multi-Turn Interactions | May 29, 2024 | Benchmarking, Dialogue Understanding | Code Available | 1 |
| The Buffer Mechanism for Multi-Step Information Reasoning in Language Models | May 24, 2024 | Mathematical Problem-Solving | —Unverified | 0 |
| Metacognitive Capabilities of LLMs: An Exploration in Mathematical Problem Solving | May 20, 2024 | GSM8K, Math | —Unverified | 0 |
| Mixture-of-Instructions: Comprehensive Alignment of a Large Language Model through the Mixture of Diverse System Prompting Instructions | Apr 29, 2024 | Language Modeling, Language Modelling | —Unverified | 0 |
| Insights into Alignment: Evaluating DPO and its Variants Across Multiple Tasks | Apr 23, 2024 | Mathematical Problem-Solving, Question Answering | Code Available | 1 |
| Mathify: Evaluating Large Language Models on Mathematical Problem Solving Tasks | Apr 19, 2024 | Mathematical Problem-Solving | Code Available | 0 |
| ChatGLM-Math: Improving Math Problem-Solving in Large Language Models with a Self-Critique Pipeline | Apr 3, 2024 | Math, Mathematical Problem-Solving | Code Available | 2 |
| Can LLMs Master Math? Investigating Large Language Models on Math Stack Exchange | Mar 30, 2024 | Math, Mathematical Problem-Solving | Code Available | 0 |
| PCToolkit: A Unified Plug-and-Play Prompt Compression Toolkit of Large Language Models | Mar 26, 2024 | Code Completion, Few-Shot Learning | Code Available | 3 |
| SmallToLarge (S2L): Scalable Data Selection for Fine-tuning Large Language Models by Summarizing Training Trajectories of Small Models | Mar 12, 2024 | Math, Mathematical Problem-Solving | Code Available | 0 |
| Premise Order Matters in Reasoning with Large Language Models | Feb 14, 2024 | GSM8K, Mathematical Problem-Solving | —Unverified | 0 |
| Large Language Models for Mathematical Reasoning: Progresses and Challenges | Jan 31, 2024 | Diversity, Math | —Unverified | 0 |
| G-LLaVA: Solving Geometric Problem with Multi-Modal Large Language Model | Dec 18, 2023 | Language Modeling, Language Modelling | Code Available | 4 |
| Three Questions Concerning the Use of Large Language Models to Facilitate Mathematics Learning | Oct 20, 2023 | Mathematical Problem-Solving, Position | —Unverified | 0 |
| SEGO: Sequential Subgoal Optimization for Mathematical Problem-Solving | Oct 19, 2023 | GSM8K, Math | Code Available | 0 |
| Data Contamination Through the Lens of Time | Oct 16, 2023 | Mathematical Problem-Solving | Code Available | 0 |
| The Consensus Game: Language Model Generation via Equilibrium Search | Oct 13, 2023 | Language Modeling, Language Modelling | —Unverified | 0 |
| ToRA: A Tool-Integrated Reasoning Agent for Mathematical Problem Solving | Sep 29, 2023 | Arithmetic Reasoning, Computational Efficiency | Code Available | 3 |
| Beyond Traditional Teaching: The Potential of Large Language Models and Chatbots in Graduate Engineering Education | Sep 9, 2023 | Chatbot, Mathematical Problem-Solving | —Unverified | 0 |
| Bayesian artificial brain with ChatGPT | Aug 28, 2023 | Mathematical Problem-Solving | —Unverified | 0 |
| JiuZhang 2.0: A Unified Chinese Pre-trained Language Model for Multi-task Mathematical Problem Solving | Jun 19, 2023 | In-Context Learning, Language Modeling | —Unverified | 0 |