| Stem-ming the Tide: Predicting STEM attrition using student transcript data | Aug 28, 2017 | BIG-bench Machine Learning, Math | —Unverified | 0 | 0 |
| STEM-POM: Evaluating Language Models Math-Symbol Reasoning in Document Parsing | Nov 1, 2024 | 2k, In-Context Learning | —Unverified | 0 | 0 |
| Step-by-Step Reasoning for Math Problems via Twisted Sequential Monte Carlo | Oct 2, 2024 | Math | —Unverified | 0 | 0 |
| xGen-small Technical Report | May 10, 2025 | Decoder, Math | —Unverified | 0 | 0 |
| VideoGameBench: Can Vision-Language Models complete popular video games? | May 23, 2025 | Math | —Unverified | 0 | 0 |
| Step Guided Reasoning: Improving Mathematical Reasoning using Guidance Generation and Step Reasoning | Oct 18, 2024 | Math, Mathematical Reasoning | —Unverified | 0 | 0 |
| Step-KTO: Optimizing Mathematical Reasoning through Stepwise Binary Feedback | Jan 18, 2025 | Math, Mathematical Reasoning | —Unverified | 0 | 0 |
| A case study : Influence of Dimension Reduction on regression trees-based Algorithms -Predicting Aeronautics Loads of a Derivative Aircraft | Nov 16, 2018 | Dimensionality Reduction, Math | —Unverified | 0 | 0 |
| Let GPT be a Math Tutor: Teaching Math Word Problem Solvers with Customized Exercise Generation | May 22, 2023 | Knowledge Tracing, Math | —Unverified | 0 | 0 |
| Agent-RLVR: Training Software Engineering Agents via Guidance and Environment Rewards | Jun 13, 2025 | Math, Navigate | —Unverified | 0 | 0 |
| A Careful Examination of Large Language Model Performance on Grade School Arithmetic | May 1, 2024 | GSM8K, Language Modeling | —Unverified | 0 | 0 |
| Strictly monotone mean-variance preferences with applications to portfolio selection | Dec 18, 2024 | Management, Math | —Unverified | 0 | 0 |
| StructTest: Benchmarking LLMs' Reasoning through Compositional Structured Outputs | Dec 23, 2024 | Benchmarking, Logical Reasoning | —Unverified | 0 | 0 |
| A Bayesian model for recognizing handwritten mathematical expressions | Sep 18, 2014 | Math, model | —Unverified | 0 | 0 |
| Students' Perceived Roles, Opportunities, and Challenges of a Generative AI-powered Teachable Agent: A Case of Middle School Math Class | Aug 26, 2024 | Math | —Unverified | 0 | 0 |
| VISTA: Visual Integrated System for Tailored Automation in Math Problem Generation Using LLM | Nov 8, 2024 | Math | —Unverified | 0 | 0 |
| Subtle Errors Matter: Preference Learning via Error-injected Self-editing | Oct 9, 2024 | GSM8K, Math | —Unverified | 0 | 0 |
| A General Retrieval-Augmented Generation Framework for Multimodal Case-Based Reasoning Applications | Jan 9, 2025 | Math, RAG | —Unverified | 0 | 0 |
| Supervised Optimism Correction: Be Confident When LLMs Are Sure | Apr 10, 2025 | GSM8K, Math | —Unverified | 0 | 0 |
| Sustainable Border Control Policy in the COVID-19 Pandemic: A Math Modeling Study | Aug 28, 2020 | Math | —Unverified | 0 | 0 |
| SVM-based Deep Stacking Networks | Feb 15, 2019 | Math | —Unverified | 0 | 0 |
| SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution | Feb 25, 2025 | Math, Reinforcement Learning (RL) | —Unverified | 0 | 0 |
| Visual Analytics of Student Learning Behaviors on K-12 Mathematics E-learning Platforms | Sep 7, 2019 | Math | —Unverified | 0 | 0 |
| Symbolic Mixture-of-Experts: Adaptive Skill-based Routing for Heterogeneous Reasoning | Mar 7, 2025 | GPU, Math | —Unverified | 0 | 0 |
| Advancing Process Verification for Large Language Models via Tree-Based Preference Learning | Jun 29, 2024 | Binary Classification, GSM8K | —Unverified | 0 | 0 |
| Synthetic Data Generation & Multi-Step RL for Reasoning & Tool Use | Apr 7, 2025 | GSM8K, Math | —Unverified | 0 | 0 |
| Chimera: Improving Generalist Model with Domain-Specific Experts | Dec 8, 2024 | Math, model | —Unverified | 0 | 0 |
| Advancing Mathematical Reasoning in Language Models: The Impact of Problem-Solving Data, Data Synthesis Methods, and Training Stages | Jan 23, 2025 | Instruction Following, Math | —Unverified | 0 | 0 |
| Classification and Clustering of arXiv Documents, Sections, and Abstracts, Comparing Encodings of Natural and Mathematical Language | May 22, 2020 | Classification, Clustering | —Unverified | 0 | 0 |
| Class Prototypes Based Contrastive Learning for Classifying Multi-Label and Fine-Grained Educational Videos | Jan 1, 2023 | Contrastive Learning, Math | —Unverified | 0 | 0 |
| Clear Preferences Leave Traces: Reference Model-Guided Sampling for Preference Learning | Jan 25, 2025 | Math | —Unverified | 0 | 0 |
| SynthRL: Scaling Visual Reasoning with Verifiable Data Synthesis | Jun 2, 2025 | 8k, Math | —Unverified | 0 | 0 |
| ClickTree: A Tree-based Method for Predicting Math Students' Performance Based on Clickstream Data | Mar 1, 2024 | Math | —Unverified | 0 | 0 |
| CLST: Cold-Start Mitigation in Knowledge Tracing by Aligning a Generative Language Model as a Students' Knowledge Tracer | Jun 13, 2024 | Domain Generalization, Knowledge Tracing | —Unverified | 0 | 0 |
| CMATH: Can Your Language Model Pass Chinese Elementary School Math Test? | Jun 29, 2023 | Language Modeling, Language Modelling | —Unverified | 0 | 0 |
| CMMaTH: A Chinese Multi-modal Math Skill Evaluation Benchmark for Foundation Models | Jun 28, 2024 | Diversity, Math | —Unverified | 0 | 0 |
| ChemistryQA: A Complex Question Answering Dataset from Chemistry | Jan 1, 2021 | Machine Reading Comprehension, Math | —Unverified | 0 | 0 |
| Chat-TS: Enhancing Multi-Modal Reasoning Over Time-Series and Natural Language Data | Mar 13, 2025 | Large Language Model, Math | —Unverified | 0 | 0 |
| CodePMP: Scalable Preference Model Pretraining for Large Language Model Reasoning | Oct 3, 2024 | GSM8K, Language Modeling | —Unverified | 0 | 0 |
| Code Pretraining Improves Entity Tracking Abilities of Language Models | May 31, 2024 | Math | —Unverified | 0 | 0 |
| Cognitive network science reveals bias in GPT-3, ChatGPT, and GPT-4 mirroring math anxiety in high-school students | May 22, 2023 | Math, Text Generation | —Unverified | 0 | 0 |
| Cognitive Noise and Altruistic Preferences | Oct 10, 2024 | Math | —Unverified | 0 | 0 |
| System-2 Mathematical Reasoning via Enriched Instruction Tuning | Dec 22, 2024 | ERP, GSM8K | —Unverified | 0 | 0 |
| Complementing the Linear-Programming Learning Experience with the Design and Use of Computerized Games: The Formula 1 Championship Game | Sep 19, 2021 | Math | —Unverified | 0 | 0 |
| Complexity-Based Prompting for Multi-Step Reasoning | Oct 3, 2022 | Date Understanding, GSM8K | —Unverified | 0 | 0 |
| Composing Ensembles of Pre-trained Models via Iterative Consensus | Oct 20, 2022 | Arithmetic Reasoning, Image Generation | —Unverified | 0 | 0 |
| Compositional Causal Reasoning Evaluation in Language Models | Mar 6, 2025 | Math | —Unverified | 0 | 0 |
| ComSearch: Equation Searching with Combinatorial Mathematics for Solving Math Word Problems with Weak Supervision | Nov 16, 2021 | Math | —Unverified | 0 | 0 |
| ComSearch: Equation Searching with Combinatorial Mathematics for Solving Math Word Problems with Weak Supervision | Jan 16, 2022 | Math | —Unverified | 0 | 0 |
| Tackling Math Word Problems with Fine-to-Coarse Abstracting and Reasoning | May 17, 2022 | Math | —Unverified | 0 | 0 |