| Enhancing LLM Reasoning via Critique Models with Test-Time and Training-Time Supervision | Nov 25, 2024 | Mathematical Reasoning | —Unverified | 0 |
| Enhancing Length Extrapolation in Sequential Models with Pointer-Augmented Neural Memory | Apr 18, 2024 | Machine Translation, Mathematical Reasoning | —Unverified | 0 |
| Brains vs. Bytes: Evaluating LLM Proficiency in Olympiad Mathematics | Apr 1, 2025 | Math, Mathematical Problem-Solving | —Unverified | 0 |
| Embedding Self-Correction as an Inherent Ability in Large Language Models for Enhanced Mathematical Reasoning | Oct 14, 2024 | Math, Mathematical Reasoning | —Unverified | 0 |
| Eliciting Reasoning in Language Models with Cognitive Tools | Jun 13, 2025 | Mathematical Reasoning, Reinforcement Learning (RL) | —Unverified | 0 |
| Efficient Tool Use with Chain-of-Abstraction Reasoning | Jan 30, 2024 | Math, Mathematical Reasoning | —Unverified | 0 |
| Bottlenecked Transformers: Periodic KV Cache Abstraction for Generalised Reasoning | May 22, 2025 | Mathematical Reasoning | —Unverified | 0 |
| Accurate and Diverse LLM Mathematical Reasoning via Automated PRM-Guided GFlowNets | Apr 28, 2025 | Data Augmentation, Diversity | —Unverified | 0 |
| MathDivide: Improved mathematical reasoning by large language models | May 12, 2024 | GSM8K, Logical Reasoning | —Unverified | 0 |
| Efficient Model-agnostic Alignment via Bayesian Persuasion | May 29, 2024 | Code Generation, Mathematical Reasoning | —Unverified | 0 |
| Boosting Lossless Speculative Decoding via Feature Sampling and Partial Alignment Distillation | Aug 28, 2024 | Knowledge Distillation, Language Modelling | —Unverified | 0 |
| Efficient Long CoT Reasoning in Small Language Models | May 24, 2025 | Mathematical Reasoning, valid | —Unverified | 0 |
| Fewer is More: Boosting LLM Reasoning with Reinforced Context Pruning | Dec 14, 2023 | Arithmetic Reasoning, Few-Shot Learning | —Unverified | 0 |
| Let's Reason Formally: Natural-Formal Hybrid Reasoning Enhances LLM's Math Capability | May 29, 2025 | Math, Mathematical Reasoning | —Unverified | 0 |
| Agent-as-a-Service based on Agent Network | May 13, 2025 | Code Generation, Mathematical Reasoning | —Unverified | 0 |
| LemmaHead: RAG Assisted Proof Generation Using Large Language Models | Jan 27, 2025 | Automated Theorem Proving, Mathematical Proofs | —Unverified | 0 |
| Dynamic Sampling that Adapts: Iterative DPO for Self-Aware Mathematical Reasoning | May 22, 2025 | Mathematical Reasoning, Reinforcement Learning (RL) | —Unverified | 0 |
| Apriori Knowledge in an Era of Computational Opacity: The Role of AI in Mathematical Discovery | Mar 15, 2024 | Mathematical Reasoning | —Unverified | 0 |
| Learning to Reason With Relational Abstractions | Oct 6, 2022 | Mathematical Reasoning | —Unverified | 0 |
| Learning to Rank Chain-of-Thought: An Energy-Based Approach with Outcome Supervision | May 21, 2025 | GSM8K, Learning-To-Rank | —Unverified | 0 |
| Machine learning and information theory concepts towards an AI Mathematician | Mar 7, 2024 | Mathematical Reasoning | —Unverified | 0 |
| Let's Reinforce Step by Step | Nov 10, 2023 | GSM8K, Logical Reasoning | —Unverified | 0 |
| Let's reward step by step: Step-Level reward model as the Navigators for Reasoning | Oct 16, 2023 | Code Generation, GSM8K | —Unverified | 0 |
| MAPS: A Multilingual Benchmark for Global Agent Performance and Security | May 21, 2025 | Code Generation, Math | —Unverified | 0 |
| DynaMath: A Dynamic Visual Benchmark for Evaluating Mathematical Reasoning Robustness of Vision Language Models | Oct 29, 2024 | Math, Mathematical Reasoning | —Unverified | 0 |