| STRIDE: A Tool-Assisted LLM Agent Framework for Strategic and Interactive Decision-Making | May 25, 2024 | Decision Making, Mathematical Reasoning | Code Available | 1 |
| VB-LoRA: Extreme Parameter Efficient Fine-Tuning with Vector Banks | May 24, 2024 | Mathematical Reasoning, Natural Language Understanding | Code Available | 1 |
| JiuZhang3.0: Efficiently Improving Mathematical Reasoning by Training Small Data Synthesis Models | May 23, 2024 | Knowledge Distillation, Math | Code Available | 1 |
| Embedding Trajectory for Out-of-Distribution Detection in Mathematical Reasoning | May 22, 2024 | Mathematical Reasoning, Multiple-choice | Code Available | 1 |
| VisionGraph: Leveraging Large Multimodal Models for Graph Theory Problems in Visual Context | May 8, 2024 | Math, Mathematical Reasoning | Code Available | 1 |
| GOLD: Geometry Problem Solver with Natural Language Description | May 1, 2024 | Math | Code Available | 1 |
| Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing | Apr 18, 2024 | Arithmetic Reasoning, GSM8K | Code Available | 1 |
| Masked Thought: Simply Masking Partial Reasoning Steps Can Improve Mathematical Reasoning Learning of Language Models | Mar 4, 2024 | Data Augmentation, GSM8K | Code Available | 1 |
| Stepwise Self-Consistent Mathematical Reasoning with Large Language Models | Feb 24, 2024 | Math, Mathematical Reasoning | Code Available | 1 |
| ConceptMath: A Bilingual Concept-wise Benchmark for Measuring Mathematical Reasoning of Large Language Models | Feb 22, 2024 | Math, Mathematical Reasoning | Code Available | 1 |