| Enhancing LLM Reasoning via Critique Models with Test-Time and Training-Time Supervision | Nov 25, 2024 | Mathematical Reasoning | —Unverified | 0 | 0 |
| Enhancing Mathematical Reasoning in Large Language Models with Self-Consistency-Based Hallucination Detection | Apr 13, 2025 | Answer Selection, Automated Theorem Proving | —Unverified | 0 | 0 |
| Enhancing Mathematical Reasoning in LLMs by Stepwise Correction | Oct 16, 2024 | Mathematical Reasoning | —Unverified | 0 | 0 |
| Enhancing Mathematical Reasoning in LLMs with Background Operators | Dec 5, 2024 | Data Augmentation, Math | —Unverified | 0 | 0 |
| Enhancing Neural Mathematical Reasoning by Abductive Combination with Symbolic Library | Mar 28, 2022 | Logical Reasoning, Mathematical Reasoning | —Unverified | 0 | 0 |
| Enhancing Reasoning through Process Supervision with Monte Carlo Tree Search | Jan 2, 2025 | Mathematical Reasoning | —Unverified | 0 | 0 |
| Enigmata: Scaling Logical Reasoning in Large Language Models with Synthetic Verifiable Puzzles | May 26, 2025 | ARC, Logical Reasoning | —Unverified | 0 | 0 |
| Ensembling Large Language Models with Process Reward-Guided Tree Search for Better Complex Reasoning | Dec 20, 2024 | Language Modeling, Language Modelling | —Unverified | 0 | 0 |
| Entropy-Aware Branching for Improved Mathematical Reasoning | Mar 27, 2025 | Mathematical Reasoning | —Unverified | 0 | 0 |
| Error Classification of Large Language Models on Math Word Problems: A Dynamically Adaptive Framework | Jan 26, 2025 | Math, Mathematical Reasoning | —Unverified | 0 | 0 |
| ErrorRadar: Benchmarking Complex Mathematical Reasoning of Multimodal Large Language Models Via Error Detection | Oct 6, 2024 | Benchmarking, Mathematical Reasoning | —Unverified | 0 | 0 |
| Evaluating Grounded Reasoning by Code-Assisted Large Language Models for Mathematics | Apr 24, 2025 | Code Generation, Math | —Unverified | 0 | 0 |
| Evaluating Large Vision-and-Language Models on Children's Mathematical Olympiads | Jun 22, 2024 | Mathematical Reasoning | —Unverified | 0 | 0 |
| Evaluating Robustness of Reward Models for Mathematical Reasoning | Oct 2, 2024 | Math, Mathematical Reasoning | —Unverified | 0 | 0 |
| Evaluating the Meta- and Object-Level Reasoning of Large Language Models for Question Answering | Feb 14, 2025 | Mathematical Reasoning, Object | —Unverified | 0 | 0 |
| Evaluation of LLMs for mathematical problem solving | May 30, 2025 | GSM8K, Mathematical Problem-Solving | —Unverified | 0 | 0 |
| Evaluation of OpenAI o1: Opportunities and Challenges of AGI | Sep 27, 2024 | Emotion Recognition, Large Language Model | —Unverified | 0 | 0 |
| Evolutionary Pre-Prompt Optimization for Mathematical Reasoning | Dec 5, 2024 | Few-Shot Learning, GSM8K | —Unverified | 0 | 0 |
| Evolving LLMs' Self-Refinement Capability via Iterative Preference Optimization | Feb 8, 2025 | GSM8K, Math | —Unverified | 0 | 0 |
| Crossing the Reward Bridge: Expanding RL with Verifiable Rewards Across Diverse Domains | Mar 31, 2025 | Mathematical Reasoning, reinforcement-learning | —Unverified | 0 | 0 |
| Expanding Search Space with Diverse Prompting Agents: An Efficient Sampling Approach for LLM Mathematical Reasoning | Oct 13, 2024 | Math, Mathematical Reasoning | —Unverified | 0 | 0 |
| Expediting and Elevating Large Language Model Reasoning via Hidden Chain-of-Thought Decoding | Sep 13, 2024 | Contrastive Learning, Language Modeling | —Unverified | 0 | 0 |
| Explain with Visual Keypoints Like a Real Mentor! A Benchmark for Multimodal Solution Explanation | Apr 4, 2025 | Math, Mathematical Reasoning | —Unverified | 0 | 0 |
| Exploring Mathematical Extrapolation of Large Language Models with Synthetic Data | Jun 4, 2024 | Mathematical Reasoning, Text Generation | —Unverified | 0 | 0 |
| Exploring the Limits of Fine-grained LLM-based Physics Inference via Premise Removal Interventions | Apr 29, 2024 | Mathematical Reasoning | —Unverified | 0 | 0 |
| Exploring the Mystery of Influential Data for Mathematical Reasoning | Apr 1, 2024 | Math, Mathematical Reasoning | —Unverified | 0 | 0 |
| Exposing the Achilles' Heel: Evaluating LLMs Ability to Handle Mistakes in Mathematical Reasoning | Jun 16, 2024 | Benchmarking, Math | —Unverified | 0 | 0 |
| Federated Prompting and Chain-of-Thought Reasoning for Improving LLMs Answering | Apr 27, 2023 | Mathematical Reasoning | —Unverified | 0 | 0 |
| FG-PRM: Fine-grained Hallucination Detection and Mitigation in Language Model Mathematical Reasoning | Oct 8, 2024 | GSM8K, Hallucination | —Unverified | 0 | 0 |
| FineMath: A Fine-Grained Mathematical Evaluation Benchmark for Chinese Large Language Models | Mar 12, 2024 | Math, Mathematical Reasoning | —Unverified | 0 | 0 |
| Fine-Tuning and Prompt Optimization: Two Great Steps that Work Better Together | Jul 15, 2024 | Arithmetic Reasoning, Language Modeling | —Unverified | 0 | 0 |
| First-Step Advantage: Importance of Starting Right in Multi-Step Math Reasoning | Nov 14, 2023 | GSM8K, Math | —Unverified | 0 | 0 |
| Flow-DPO: Improving LLM Mathematical Reasoning through Online Multi-Agent Learning | Oct 29, 2024 | Mathematical Reasoning | —Unverified | 0 | 0 |
| Foreword: A Computable Universe, Understanding Computation and Exploring Nature As Computation | May 25, 2012 | Mathematical Reasoning | —Unverified | 0 | 0 |
| Formal Mathematical Reasoning: A New Frontier in AI | Dec 20, 2024 | Automated Theorem Proving, Math | —Unverified | 0 | 0 |
| Fourier Circuits in Neural Networks and Transformers: A Case Study of Modular Arithmetic with Multiple Inputs | Feb 12, 2024 | 2k, Mathematical Reasoning | —Unverified | 0 | 0 |
| From Calculation to Adjudication: Examining LLM judges on Mathematical Reasoning Tasks | Sep 6, 2024 | Machine Translation, Mathematical Reasoning | —Unverified | 0 | 0 |
| From Correctness to Comprehension: AI Agents for Personalized Error Diagnosis in Education | Feb 19, 2025 | Diagnostic, GSM8K | —Unverified | 0 | 0 |
| From Good to Great: Improving Math Reasoning with Tool-Augmented Interleaf Prompting | Dec 18, 2023 | Diversity, GSM8K | —Unverified | 0 | 0 |
| From Informal to Formal -- Incorporating and Evaluating LLMs on Natural Language Requirements to Verifiable Formal Proofs | Jan 27, 2025 | 4k, Mathematical Reasoning | —Unverified | 0 | 0 |
| FrontierMath: A Benchmark for Evaluating Advanced Mathematical Reasoning in AI | Nov 7, 2024 | Mathematical Reasoning | —Unverified | 0 | 0 |
| Full-Step-DPO: Self-Supervised Preference Optimization with Step-wise Rewards for Mathematical Reasoning | Feb 20, 2025 | Mathematical Reasoning | —Unverified | 0 | 0 |
| GAPS: Geometry-Aware Problem Solver | Jan 29, 2024 | Geometry Problem Solving, Math | —Unverified | 0 | 0 |
| GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning | Apr 1, 2025 | Math, Mathematical Reasoning | —Unverified | 0 | 0 |
| GeomVerse: A Systematic Evaluation of Large Models for Geometric Reasoning | Dec 19, 2023 | Mathematical Reasoning | —Unverified | 0 | 0 |
| GFlowNet Fine-tuning for Diverse Correct Solutions in Mathematical Reasoning Tasks | Oct 26, 2024 | Diversity, Mathematical Reasoning | —Unverified | 0 | 0 |
| GoRA: Gradient-driven Adaptive Low Rank Adaptation | Feb 13, 2025 | Computational Efficiency, Mathematical Reasoning | —Unverified | 0 | 0 |
| GraphIC: A Graph-Based In-Context Example Retrieval Model for Multi-Step Reasoning | Oct 3, 2024 | Code Generation, In-Context Learning | —Unverified | 0 | 0 |
| GraphMR: Graph Neural Network for Mathematical Reasoning | Nov 1, 2021 | Graph Neural Network, Graph-to-Sequence | —Unverified | 0 | 0 |
| Guided by Gut: Efficient Test-Time Scaling with Reinforced Intrinsic Confidence | May 23, 2025 | GPU, Large Language Model | —Unverified | 0 | 0 |