| Measurement to Meaning: A Validity-Centered Framework for AI Evaluation | May 13, 2025 | Math | —Unverified | 0 | 0 |
| Measuring and Improving BERT's Mathematical Abilities by Predicting the Order of Reasoning | Jun 7, 2021 | Language Modeling, Language Modelling | —Unverified | 0 | 0 |
| Measuring and Improving BERT's Mathematical Abilities by Predicting the Order of Reasoning. | Aug 1, 2021 | Language Modeling, Language Modelling | —Unverified | 0 | 0 |
| When Not to Answer: Evaluating Prompts on GPT Models for Effective Abstention in Unanswerable Math Word Problems | Oct 16, 2024 | Hallucination, Math | —Unverified | 0 | 0 |
| Measuring Large Language Models Capacity to Annotate Journalistic Sourcing | Dec 30, 2024 | Benchmarking, Ethics | —Unverified | 0 | 0 |
| Asymptotic behavior of mean fixation times in the Moran process with frequency-independent fitnesses | Dec 30, 2022 | Math | —Unverified | 0 | 0 |
| Mechanochemical models for calcium waves in embryonic epithelia | Nov 3, 2021 | Math | —Unverified | 0 | 0 |
| To Err is Machine: Vulnerability Detection Challenges LLM Reasoning | Mar 25, 2024 | Code Generation, In-Context Learning | —Unverified | 0 | 0 |
| Med-RLVR: Emerging Medical Reasoning from a 3B base model via reinforcement Learning | Feb 27, 2025 | Math, Medical Question Answering | —Unverified | 0 | 0 |
| A Survey on Multimodal Large Language Models | Jun 23, 2023 | Hallucination, In-Context Learning | —Unverified | 0 | 0 |
| A Survey on Feedback-based Multi-step Reasoning for Large Language Models on Mathematics | Feb 20, 2025 | Math | —Unverified | 0 | 0 |
| Translating a Math Word Problem to a Expression Tree | Oct 1, 2018 | Machine Translation, Math | —Unverified | 0 | 0 |
| Mental Stress Detection: Development and Evaluation of a Wearable In-Ear Plethysmography | Apr 12, 2024 | Math, Mental Stress Detection | —Unverified | 0 | 0 |
| Metacognitive Capabilities of LLMs: An Exploration in Mathematical Problem Solving | May 20, 2024 | GSM8K, Math | —Unverified | 0 | 0 |
| A Survey of Slow Thinking-based Reasoning LLMs using Reinforced Learning and Inference-time Scaling Law | May 5, 2025 | Math, Medical Diagnosis | —Unverified | 0 | 0 |
| A Survey of Question Answering for Math and Science Problem | May 10, 2017 | Math, Question Answering | —Unverified | 0 | 0 |
| INC-Math: Integrating Natural Language and Code for Enhanced Mathematical Reasoning in Large Language Models | Sep 28, 2024 | Math, Mathematical Reasoning | —Unverified | 0 | 0 |
| A Survey of Mathematical Reasoning in the Era of Multimodal Large Language Model: Benchmark, Method & Challenges | Dec 16, 2024 | Language Modeling, Language Modelling | —Unverified | 0 | 0 |
| A Study on Leveraging Search and Self-Feedback for Agent Reasoning | Feb 17, 2025 | Math | —Unverified | 0 | 0 |
| Metric-agnostic Ranking Optimization | Apr 17, 2023 | Information Retrieval, Learning-To-Rank | —Unverified | 0 | 0 |
| MIaS: Math-Aware Retrieval in Digital Mathematical Libraries | Aug 28, 2018 | Information Retrieval, Math | —Unverified | 0 | 0 |
| MiLoRA: Efficient Mixture of Low-Rank Adaptation for Large Language Models Fine-tuning | Oct 23, 2024 | Math, Mixture-of-Experts | —Unverified | 0 | 0 |
| A Study of PHOC Spatial Region Configurations for Math Formula Retrieval | Aug 17, 2024 | Math, Retrieval | —Unverified | 0 | 0 |
| MIND: Math Informed syNthetic Dialogues for Pretraining LLMs | Oct 15, 2024 | GSM8K, Math | —Unverified | 0 | 0 |
| Mind meets machine: Unravelling GPT-4's cognitive psychology | Mar 20, 2023 | Common Sense Reasoning, Decision Making | —Unverified | 0 | 0 |