| Self-Polish: Enhance Reasoning in Large Language Models via Problem Refinement | May 23, 2023 | GSM8K | CodeCode Available | 1 |
| Solving Math Word Problems by Combining Language Models With Symbolic Solvers | Apr 16, 2023 | GSM8KLanguage Modeling | CodeCode Available | 1 |
| Boosted Prompt Ensembles for Large Language Models | Apr 12, 2023 | GSM8KLanguage Modeling | CodeCode Available | 1 |
| Large Language Models Are Latent Variable Models: Explaining and Finding Good Demonstrations for In-Context Learning | Jan 27, 2023 | Few-Shot LearningGSM8K | CodeCode Available | 1 |
| Learning Math Reasoning from Self-Sampled Correct and Partially-Correct Solutions | May 28, 2022 | Arithmetic ReasoningEfficient Exploration | CodeCode Available | 1 |
| Self-Consistency Improves Chain of Thought Reasoning in Language Models | Mar 21, 2022 | ARCArithmetic Reasoning | CodeCode Available | 1 |
| GEMMAS: Graph-based Evaluation Metrics for Multi Agent Systems | Jul 17, 2025 | DiversityGSM8K | —Unverified | 0 |
| DAC: A Dynamic Attention-aware Approach for Task-Agnostic Prompt Compression | Jul 16, 2025 | GSM8K | CodeCode Available | 0 |
| KisMATH: Do LLMs Have Knowledge of Implicit Structures in Mathematical Reasoning? | Jul 15, 2025 | GSM8KLanguage Modeling | —Unverified | 0 |
| CoRE: Enhancing Metacognition with Label-free Self-evaluation in LRMs | Jul 8, 2025 | GSM8KMath | —Unverified | 0 |