| Mars-PO: Multi-Agent Reasoning System Preference Optimization | Nov 28, 2024 | Math, Mathematical Reasoning | Unverified | 0 |
| Beyond Examples: High-level Automated Reasoning Paradigm in In-Context Learning via MCTS | Nov 27, 2024 | In-Context Learning, Math | Code Available | 0 |
| Training and Evaluating Language Models with Template-based Data Generation | Nov 27, 2024 | Data Augmentation, Math | Code Available | 1 |
| Preference Optimization for Reasoning with Pseudo Feedback | Nov 25, 2024 | GSM8K, Math | Code Available | 2 |
| Enhancing LLM Reasoning via Critique Models with Test-Time and Training-Time Supervision | Nov 25, 2024 | Mathematical Reasoning | Unverified | 0 |
| O1 Replication Journey -- Part 2: Surpassing O1-preview through Simple Distillation, Big Progress or Bitter Lesson? | Nov 25, 2024 | Hallucination, Knowledge Distillation | Code Available | 7 |
| MC-NEST -- Enhancing Mathematical Reasoning in Large Language Models with a Monte Carlo Nash Equilibrium Self-Refine Tree | Nov 23, 2024 | Decision Making, Mathematical Reasoning | Code Available | 0 |
| Improving Mathematical Reasoning Capabilities of Small Language Models via Feedback-Driven Distillation | Nov 22, 2024 | Knowledge Distillation, Mathematical Reasoning | Unverified | 0 |
| Large Language Models for Combinatorial Optimization of Design Structure Matrix | Nov 19, 2024 | Combinatorial Optimization, Mathematical Reasoning | Unverified | 0 |
| Procedural Knowledge in Pretraining Drives Reasoning in Large Language Models | Nov 19, 2024 | Mathematical Reasoning | Unverified | 0 |