| Evaluating Token-Level and Passage-Level Dense Retrieval Models for Math Information Retrieval | Mar 21, 2022 | Information Retrieval, Math | Code Available | 0 |
| MAMUT: A Novel Framework for Modifying Mathematical Formulas for the Generation of Specialized Datasets for Language Model Training | Feb 28, 2025 | Language Modeling, Language Modelling | Code Available | 0 |
| Hard Negative Contrastive Learning for Fine-Grained Geometric Understanding in Large Multimodal Models | May 26, 2025 | Contrastive Learning, Math | Code Available | 0 |
| Effects of structure on reasoning in instance-level Self-Discover | Jul 4, 2025 | Math | Code Available | 0 |
| Mapping to Declarative Knowledge for Word Problem Solving | Dec 26, 2017 | Math, Translation | Code Available | 0 |
| NUMCoT: Numerals and Units of Measurement in Chain-of-Thought Reasoning using Large Language Models | Jun 5, 2024 | Math, Mathematical Reasoning | Code Available | 0 |
| MARGE: Improving Math Reasoning for LLMs with Guided Exploration | May 18, 2025 | Math, Mathematical Reasoning | Code Available | 0 |
| Helpful assistant or fruitful facilitator? Investigating how personas affect language model behavior | Jul 2, 2024 | Language Modeling, Language Modelling | Code Available | 0 |
| Evaluating Judges as Evaluators: The JETTS Benchmark of LLM-as-Judges as Test-Time Scaling Evaluators | Apr 21, 2025 | Code Generation, Instruction Following | Code Available | 0 |
| Automated Knowledge Concept Annotation and Question Representation Learning for Knowledge Tracing | Oct 2, 2024 | Contrastive Learning, Knowledge Tracing | Code Available | 0 |
| Efficient Non-Parametric Optimizer Search for Diverse Tasks | Sep 27, 2022 | AutoML, Math | Code Available | 0 |
| Heteroclinic cycling and extinction in May-Leonard models with demographic stochasticity | Nov 10, 2021 | Math, Unity | Code Available | 0 |
| Deterministic and Nondeterministic Particle Motion with Interaction Mechanisms | Dec 31, 2022 | Math | Code Available | 0 |
| ArithmAttack: Evaluating Robustness of LLMs to Noisy Context in Math Problem Solving | Jan 14, 2025 | GSM8K, Math | Code Available | 0 |
| LM^2: A Simple Society of Language Models Solves Complex Reasoning | Apr 2, 2024 | Math, MedQA | Code Available | 0 |
| AALC: Large Language Model Efficient Reasoning via Adaptive Accuracy-Length Control | Jun 25, 2025 | Language Modeling, Language Modelling | Code Available | 0 |
| Textual Enhanced Contrastive Learning for Solving Math Word Problems | Nov 29, 2022 | Contrastive Learning, Math | Code Available | 0 |
| ReCUT: Balancing Reasoning Length and Accuracy in LLMs via Stepwise Trails and Preference Optimization | Jun 12, 2025 | Math | Code Available | 0 |
| How Do Humans Write Code? Large Models Do It the Same Way Too | Feb 24, 2024 | Code Generation, Math | Code Available | 0 |
| Don't Get Lost in the Trees: Streamlining LLM Reasoning by Overcoming Tree Search Exploration Pitfalls | Feb 16, 2025 | Computational Efficiency, GSM8K | Code Available | 0 |
| How Is LLM Reasoning Distracted by Irrelevant Context? An Analysis Using a Controlled Benchmark | May 24, 2025 | Math | Code Available | 0 |
| How Should We Enhance the Safety of Large Reasoning Models: An Empirical Study | May 21, 2025 | Math | Code Available | 0 |
| World Models for Math Story Problems | Jun 7, 2023 | Math | Code Available | 0 |
| One Language, Many Gaps: Evaluating Dialect Fairness and Robustness of Large Language Models in Reasoning Tasks | Oct 14, 2024 | Fairness, GSM8K | Code Available | 0 |
| ChatBench: From Static Benchmarks to Human-AI Evaluation | Mar 22, 2025 | Math, MMLU | Code Available | 0 |
| Augmented Math: Authoring AR-Based Explorable Explanations by Augmenting Static Math Textbooks | Jul 30, 2023 | Math, Optical Character Recognition | Code Available | 0 |
| When an LLM is apprehensive about its answers -- and when its uncertainty is justified | Mar 3, 2025 | Math, MMLU | Code Available | 0 |
| Can Large Language Models Replicate ITS Feedback on Open-Ended Math Questions? | May 10, 2024 | Math, text similarity | Code Available | 0 |
| Skellam Mixture Mechanism: a Novel Approach to Federated Learning with Differential Privacy | Dec 8, 2022 | Federated Learning, Math | Code Available | 0 |
| Classifying Math KCs via Task-Adaptive Pre-Trained BERT | May 24, 2021 | Math, Prediction | Code Available | 0 |
| Unleashing the Creative Mind: Language Model As Hierarchical Policy For Improved Exploration on Challenging Problem Solving | Nov 1, 2023 | In-Context Learning, Language Modeling | Code Available | 0 |
| ATHENA: Mathematical Reasoning with Thought Expansion | Nov 2, 2023 | Math, Mathematical Reasoning | Code Available | 0 |
| DOP: Diagnostic-Oriented Prompting for Large Language Models in Mathematical Correction | May 20, 2024 | Diagnostic, Math | Code Available | 0 |
| Warmup-Distill: Bridge the Distribution Mismatch between Teacher and Student before Knowledge Distillation | Feb 17, 2025 | Knowledge Distillation, Math | Code Available | 0 |
| Towards the Pedagogical Steering of Large Language Models for Tutoring: A Case Study with Modeling Productive Failure | Oct 3, 2024 | Math | Code Available | 0 |
| Warm Up Before You Train: Unlocking General Reasoning in Resource-Constrained Settings | May 19, 2025 | HumanEval, Math | Code Available | 0 |
| Analysis of Optimization Algorithms via Sum-of-Squares | Jun 11, 2019 | Math | Code Available | 0 |
| Mathematical Reasoning for Unmanned Aerial Vehicles: A RAG-Based Approach for Complex Arithmetic Reasoning | Jun 5, 2025 | Arithmetic Reasoning, Math | Code Available | 0 |
| Improving Compositional Generalization in Math Word Problem Solving | Sep 3, 2022 | Data Augmentation, Math | Code Available | 0 |
| Mathematical Reasoning in Large Language Models: Assessing Logical and Arithmetic Errors across Wide Numerical Ranges | Feb 12, 2025 | GSM8K, Math | Code Available | 0 |
| Mathematics Content Understanding for Cyberlearning via Formula Evolution Map | Dec 31, 2018 | Graph Mining, Math | Code Available | 0 |
| Analogical Math Word Problems Solving with Enhanced Problem-Solution Association | Dec 1, 2022 | Math, Question Answering | Code Available | 0 |
| Critical-Questions-of-Thought: Steering LLM reasoning with Argumentative Querying | Dec 19, 2024 | Math, Mathematical Reasoning | Code Available | 0 |
| Small Language Models Need Strong Verifiers to Self-Correct Reasoning | Apr 26, 2024 | Math | Code Available | 0 |
| SmallToLarge (S2L): Scalable Data Selection for Fine-tuning Large Language Models by Summarizing Training Trajectories of Small Models | Mar 12, 2024 | Math, Mathematical Problem-Solving | Code Available | 0 |
| OntoMath^PRO Ontology: A Linked Data Hub for Mathematics | Jul 17, 2014 | Math | Code Available | 0 |
| Enumerate-Conjecture-Prove: Formally Solving Answer-Construction Problems in Math Competitions | May 24, 2025 | Automated Theorem Proving, Math | Code Available | 0 |
| In-Context Principle Learning from Mistakes | Feb 8, 2024 | GSM8K, In-Context Learning | Code Available | 0 |
| Incorporating Graph Attention Mechanism into Geometric Problem Solving Based on Deep Reinforcement Learning | Mar 14, 2024 | Deep Reinforcement Learning, Graph Attention | Code Available | 0 |
| Smart Vision-Language Reasoners | Jul 5, 2024 | Math, Mathematical Reasoning | Code Available | 0 |