| Lynx: Enabling Efficient MoE Inference through Dynamic Batch-Aware Expert Selection | Nov 13, 2024 | Code Generation, Mathematical Reasoning | —Unverified | 0 |
| Gap-Filling Prompting Enhances Code-Assisted Mathematical Reasoning | Nov 8, 2024 | Mathematical Reasoning | Code Available | 0 |
| Benchmarking Large Language Models with Integer Sequence Generation Tasks | Nov 7, 2024 | Benchmarking, Computational Efficiency | —Unverified | 0 |
| Kwai-STaR: Transform LLMs into State-Transition Reasoners | Nov 7, 2024 | GSM8K, Mathematical Problem-Solving | —Unverified | 0 |
| FrontierMath: A Benchmark for Evaluating Advanced Mathematical Reasoning in AI | Nov 7, 2024 | Mathematical Reasoning | —Unverified | 0 |
| MoD: A Distribution-Based Approach for Merging Large Language Models | Nov 1, 2024 | Mathematical Reasoning | Code Available | 0 |
| STEM-POM: Evaluating Language Models Math-Symbol Reasoning in Document Parsing | Nov 1, 2024 | 2k, In-Context Learning | —Unverified | 0 |
| VisAidMath: Benchmarking Visual-Aided Mathematical Reasoning | Oct 30, 2024 | Benchmarking, Hallucination | —Unverified | 0 |
| Flow-DPO: Improving LLM Mathematical Reasoning through Online Multi-Agent Learning | Oct 29, 2024 | Mathematical Reasoning | —Unverified | 0 |
| DynaMath: A Dynamic Visual Benchmark for Evaluating Mathematical Reasoning Robustness of Vision Language Models | Oct 29, 2024 | Math, Mathematical Reasoning | —Unverified | 0 |
| Library Learning Doesn't: The Curious Case of the Single-Use "Library" | Oct 26, 2024 | Math, Mathematical Reasoning | Code Available | 0 |
| GFlowNet Fine-tuning for Diverse Correct Solutions in Mathematical Reasoning Tasks | Oct 26, 2024 | Diversity, Mathematical Reasoning | —Unverified | 0 |
| ReasonAgain: Using Extractable Symbolic Programs to Evaluate Mathematical Reasoning | Oct 24, 2024 | GSM8K, Math | —Unverified | 0 |
| SIKeD: Self-guided Iterative Knowledge Distillation for mathematical reasoning | Oct 24, 2024 | Knowledge Distillation, Mathematical Reasoning | Code Available | 0 |
| Improving Small-Scale Large Language Models Function Calling for Reasoning Tasks | Oct 24, 2024 | Logical Reasoning, Mathematical Problem-Solving | —Unverified | 0 |
| Markov Chain of Thought for Efficient Mathematical Reasoning | Oct 23, 2024 | Mathematical Reasoning | —Unverified | 0 |
| Can Large Language Models Invent Algorithms to Improve Themselves? | Oct 21, 2024 | Mathematical Reasoning | —Unverified | 0 |
| Keep Guessing? When Considering Inference Scaling, Mind the Baselines | Oct 20, 2024 | Mathematical Reasoning | —Unverified | 0 |
| Do Large Language Models Truly Grasp Mathematics? An Empirical Exploration From Cognitive Psychology | Oct 19, 2024 | Logical Reasoning, Math | —Unverified | 0 |
| Step Guided Reasoning: Improving Mathematical Reasoning using Guidance Generation and Step Reasoning | Oct 18, 2024 | Math, Mathematical Reasoning | —Unverified | 0 |
| How Numerical Precision Affects Mathematical Reasoning Capabilities of LLMs | Oct 17, 2024 | Mathematical Reasoning | —Unverified | 0 |
| AdaSwitch: Adaptive Switching between Small and Large Agents for Effective Cloud-Local Collaborative Learning | Oct 17, 2024 | Mathematical Reasoning, Question Answering | —Unverified | 0 |
| Enhancing Mathematical Reasoning in LLMs by Stepwise Correction | Oct 16, 2024 | Mathematical Reasoning | —Unverified | 0 |
| Not All Votes Count! Programs as Verifiers Improve Self-Consistency of Language Models for Math Reasoning | Oct 16, 2024 | All, GSM8K | Code Available | 0 |
| MIND: Math Informed syNthetic Dialogues for Pretraining LLMs | Oct 15, 2024 | GSM8K, Math | —Unverified | 0 |