| Scaling Relationship on Learning Mathematical Reasoning with Large Language Models | Aug 3, 2023 | Arithmetic Reasoning, GSM8K | Code Available | 2 |
| Multiple-Choice Questions are Efficient and Robust LLM Evaluators | May 20, 2024 | GSM8K, HumanEval | Code Available | 1 |
| Math-Shepherd: Verify and Reinforce LLMs Step-by-step without Human Annotations | Dec 14, 2023 | Arithmetic Reasoning, GSM8K | Code Available | 1 |
| Math Neurosurgery: Isolating Language Models' Math Reasoning Abilities Using Only Forward Passes | Oct 22, 2024 | GSM8K, Language Modeling | Code Available | 1 |
| Masked Thought: Simply Masking Partial Reasoning Steps Can Improve Mathematical Reasoning Learning of Language Models | Mar 4, 2024 | Data Augmentation, GSM8K | Code Available | 1 |
| Automatic Model Selection with Large Language Models for Reasoning | May 23, 2023 | Arithmetic Reasoning, GSM8K | Code Available | 1 |
| CommVQ: Commutative Vector Quantization for KV Cache Compression | Jun 23, 2025 | GPU, GSM8K | Code Available | 1 |
| Entropy-Based Adaptive Weighting for Self-Training | Mar 31, 2025 | GSM8K, Math | Code Available | 1 |
| Coevolving with the Other You: Fine-Tuning LLM with Sequential Cooperative Multi-Agent Reinforcement Learning | Oct 8, 2024 | GSM8K, Multi-agent Reinforcement Learning | Code Available | 1 |
| LoRA Done RITE: Robust Invariant Transformation Equilibration for LoRA Optimization | Oct 27, 2024 | GSM8K, HellaSwag | Code Available | 1 |