| CHAMP: A Competition-level Dataset for Fine-Grained Analyses of LLMs' Mathematical Reasoning Capabilities | Jan 13, 2024 | Math, Mathematical Reasoning | —Unverified | 0 | 0 |
| Channel Merging: Preserving Specialization for Merged Experts | Dec 18, 2024 | Code Generation, GPU | —Unverified | 0 | 0 |
| CLEAR: Contrasting Textual Feedback with Experts and Amateurs for Reasoning | Mar 24, 2025 | Language Modeling, Language Modelling | —Unverified | 0 | 0 |
| Coarse-to-Fine Process Reward Modeling for Enhanced Mathematical Reasoning | Jan 23, 2025 | Attribute, Mathematical Reasoning | —Unverified | 0 | 0 |
| CodeGemma: Open Code Models Based on Gemma | Jun 17, 2024 | Code Completion, Mathematical Reasoning | —Unverified | 0 | 0 |
| CodePMP: Scalable Preference Model Pretraining for Large Language Model Reasoning | Oct 3, 2024 | GSM8K, Language Modeling | —Unverified | 0 | 0 |
| Composing Ensembles of Pre-trained Models via Iterative Consensus | Oct 20, 2022 | Arithmetic Reasoning, Image Generation | —Unverified | 0 | 0 |
| Concept Distillation from Strong to Weak Models via Hypotheses-to-Theories Prompting | Aug 18, 2024 | HumanEval, Mathematical Reasoning | —Unverified | 0 | 0 |
| Conjectures, Tests and Proofs: An Overview of Theory Exploration | Sep 7, 2021 | Automated Theorem Proving, Mathematical Reasoning | —Unverified | 0 | 0 |
| ControlMath: Controllable Data Generation Promotes Math Generalist Models | Sep 20, 2024 | Data Augmentation, Diversity | —Unverified | 0 | 0 |