| Template-Driven LLM-Paraphrased Framework for Tabular Math Word Problem Generation | Dec 20, 2024 | Math, Mathematical Reasoning | Code Available | 0 |
| Ensembling Large Language Models with Process Reward-Guided Tree Search for Better Complex Reasoning | Dec 20, 2024 | Language Modeling, Language Modelling | Unverified | 0 |
| Formal Mathematical Reasoning: A New Frontier in AI | Dec 20, 2024 | Automated Theorem Proving, Math | Unverified | 0 |
| Offline Reinforcement Learning for LLM Multi-Step Reasoning | Dec 20, 2024 | GSM8K, Math | Code Available | 2 |
| What Are Step-Level Reward Models Rewarding? Counterintuitive Findings from MCTS-Boosted Mathematical Reasoning | Dec 20, 2024 | Mathematical Reasoning | Unverified | 0 |
| Qwen2.5 Technical Report | Dec 19, 2024 | Common Sense Reasoning | Code Available | 13 |
| Critical-Questions-of-Thought: Steering LLM reasoning with Argumentative Querying | Dec 19, 2024 | Math, Mathematical Reasoning | Code Available | 0 |
| Channel Merging: Preserving Specialization for Merged Experts | Dec 18, 2024 | Code Generation, GPU | Unverified | 0 |
| MetaRuleGPT: Recursive Numerical Reasoning of Language Models Trained with Simple Rules | Dec 18, 2024 | Mathematical Reasoning, Meta-Learning | Unverified | 0 |
| MultiLingPoT: Enhancing Mathematical Reasoning with Multilingual Program Fine-tuning | Dec 17, 2024 | Mathematical Reasoning | Code Available | 0 |
| A Survey of Mathematical Reasoning in the Era of Multimodal Large Language Model: Benchmark, Method & Challenges | Dec 16, 2024 | Language Modeling, Language Modelling | Unverified | 0 |
| Can Language Models Rival Mathematics Students? Evaluating Mathematical Reasoning through Textual Manipulation and Human Experiments | Dec 16, 2024 | Mathematical Reasoning | Unverified | 0 |
| CoinMath: Harnessing the Power of Coding Instruction for Math LLMs | Dec 16, 2024 | Descriptive, Math | Code Available | 0 |
| Entropy-Regularized Process Reward Model | Dec 15, 2024 | GSM8K, Math | Code Available | 1 |
| Low-Rank Adaptation with Task-Relevant Feature Enhancement for Fine-tuning Language Models | Dec 13, 2024 | Mathematical Reasoning | Unverified | 0 |
| A Graph-Based Synthetic Data Pipeline for Scaling High-Quality Reasoning Instructions | Dec 12, 2024 | GSM8K, Knowledge Graphs | Unverified | 0 |
| Sail into the Headwind: Alignment via Robust Rewards and Dynamic Labels against Reward Hacking | Dec 12, 2024 | Mathematical Reasoning | Unverified | 0 |
| SmolTulu: Higher Learning Rate to Batch Size Ratios Can Lead to Better Reasoning in SLMs | Dec 11, 2024 | ARC, GSM8K | Unverified | 0 |
| Optimizing Alignment with Less: Leveraging Data Augmentation for Personalized Evaluation | Dec 10, 2024 | Data Augmentation, Mathematical Reasoning | Unverified | 0 |
| Applications of Positive Unlabeled (PU) and Negative Unlabeled (NU) Learning in Cybersecurity | Dec 9, 2024 | Intrusion Detection, Malware Detection | Unverified | 0 |
| ProcessBench: Identifying Process Errors in Mathematical Reasoning | Dec 9, 2024 | GSM8K, Math | Code Available | 2 |
| TACO: Learning Multi-modal Action Models with Synthetic Chains-of-Thought-and-Action | Dec 7, 2024 | Depth Estimation, Mathematical Reasoning | Code Available | 2 |
| Neuro-Symbolic Data Generation for Math Reasoning | Dec 6, 2024 | Diversity, Math | Unverified | 0 |
| Evolutionary Pre-Prompt Optimization for Mathematical Reasoning | Dec 5, 2024 | Few-Shot Learning, GSM8K | Unverified | 0 |
| Enhancing Mathematical Reasoning in LLMs with Background Operators | Dec 5, 2024 | Data Augmentation, Math | Unverified | 0 |
| Training-Free Mitigation of Language Reasoning Degradation After Multimodal Instruction Tuning | Dec 4, 2024 | GSM8K, Language Modeling | Unverified | 0 |
| Improving Physics Reasoning in Large Language Models Using Mixture of Refinement Agents | Dec 1, 2024 | Mathematical Reasoning, MMLU | Unverified | 0 |
| Initialization using Update Approximation is a Silver Bullet for Extremely Efficient Low-Rank Fine-Tuning | Nov 29, 2024 | Mathematical Reasoning | Code Available | 2 |
| Critical Tokens Matter: Token-Level Contrastive Estimation Enhances LLM's Reasoning Capability | Nov 29, 2024 | GSM8K, Math | Code Available | 1 |
| MATATA: Weakly Supervised End-to-End MAthematical Tool-Augmented Reasoning for Tabular Applications | Nov 28, 2024 | document understanding, Mathematical Reasoning | Unverified | 0 |
| Mars-PO: Multi-Agent Reasoning System Preference Optimization | Nov 28, 2024 | Math, Mathematical Reasoning | Unverified | 0 |
| Training and Evaluating Language Models with Template-based Data Generation | Nov 27, 2024 | Data Augmentation, Math | Code Available | 1 |
| Beyond Examples: High-level Automated Reasoning Paradigm in In-Context Learning via MCTS | Nov 27, 2024 | In-Context Learning, Math | Code Available | 0 |
| Preference Optimization for Reasoning with Pseudo Feedback | Nov 25, 2024 | GSM8K, Math | Code Available | 2 |
| O1 Replication Journey -- Part 2: Surpassing O1-preview through Simple Distillation, Big Progress or Bitter Lesson? | Nov 25, 2024 | Hallucination, Knowledge Distillation | Code Available | 7 |
| Enhancing LLM Reasoning via Critique Models with Test-Time and Training-Time Supervision | Nov 25, 2024 | Mathematical Reasoning | Unverified | 0 |
| MC-NEST -- Enhancing Mathematical Reasoning in Large Language Models with a Monte Carlo Nash Equilibrium Self-Refine Tree | Nov 23, 2024 | Decision Making, Mathematical Reasoning | Code Available | 0 |
| Improving Mathematical Reasoning Capabilities of Small Language Models via Feedback-Driven Distillation | Nov 22, 2024 | Knowledge Distillation, Mathematical Reasoning | Unverified | 0 |
| Procedural Knowledge in Pretraining Drives Reasoning in Large Language Models | Nov 19, 2024 | Mathematical Reasoning | Unverified | 0 |
| Large Language Models for Combinatorial Optimization of Design Structure Matrix | Nov 19, 2024 | Combinatorial Optimization, Mathematical Reasoning | Unverified | 0 |
| AtomThink: A Slow Thinking Framework for Multimodal Mathematical Reasoning | Nov 18, 2024 | Mathematical Reasoning | Code Available | 2 |
| Enhancing LLM Reasoning with Reward-guided Tree Search | Nov 18, 2024 | Mathematical Reasoning | Code Available | 2 |
| PSPO*: An Effective Process-supervised Policy Optimization for Reasoning Alignment | Nov 18, 2024 | Mathematical Reasoning | Code Available | 0 |
| Lynx: Enabling Efficient MoE Inference through Dynamic Batch-Aware Expert Selection | Nov 13, 2024 | Code Generation, Mathematical Reasoning | Unverified | 0 |
| UTMath: Math Evaluation with Unit Test via Reasoning-to-Coding Thoughts | Nov 11, 2024 | Code Generation, GSM8K | Code Available | 1 |
| Gap-Filling Prompting Enhances Code-Assisted Mathematical Reasoning | Nov 8, 2024 | Mathematical Reasoning | Code Available | 0 |
| Kwai-STaR: Transform LLMs into State-Transition Reasoners | Nov 7, 2024 | GSM8K, Mathematical Problem-Solving | Unverified | 0 |
| Benchmarking Large Language Models with Integer Sequence Generation Tasks | Nov 7, 2024 | Benchmarking, Computational Efficiency | Unverified | 0 |
| FrontierMath: A Benchmark for Evaluating Advanced Mathematical Reasoning in AI | Nov 7, 2024 | Mathematical Reasoning | Unverified | 0 |
| MoD: A Distribution-Based Approach for Merging Large Language Models | Nov 1, 2024 | Mathematical Reasoning | Code Available | 0 |