| Big Math and the One-Brain Barrier A Position Paper and Architecture Proposal | Apr 23, 2019 | Math, Position | —Unverified | 0 |
| DynaMath: A Dynamic Visual Benchmark for Evaluating Mathematical Reasoning Robustness of Vision Language Models | Oct 29, 2024 | Math, Mathematical Reasoning | —Unverified | 0 |
| Dualformer: Controllable Fast and Slow Thinking by Learning with Randomized Reasoning Traces | Oct 13, 2024 | Computational Efficiency, Math | —Unverified | 0 |
| Accurate closed-form solution of the SIR epidemic model | Apr 16, 2020 | Form, Math | —Unverified | 0 |
| SelfBudgeter: Adaptive Token Allocation for Efficient LLM Reasoning | May 16, 2025 | Math | —Unverified | 0 |
| Biased Programmers? Or Biased Data? A Field Experiment in Operationalizing AI Ethics | Dec 4, 2020 | Ethics, Math | —Unverified | 0 |
| DrawEduMath: Evaluating Vision Language Models with Expert-Annotated Students' Hand-Drawn Math Images | Jan 24, 2025 | Math | —Unverified | 0 |
| Do Thinking Tokens Help or Trap? Towards More Efficient Large Reasoning Model | Jun 30, 2025 | Math | —Unverified | 0 |
| An Improved Coarse-to-Fine Method for Solving Generation Tasks | Apr 1, 2019 | Math, Math Word Problem Solving | —Unverified | 0 |
| A General Retrieval-Augmented Generation Framework for Multimodal Case-Based Reasoning Applications | Jan 9, 2025 | Math, RAG | —Unverified | 0 |
| JMMMU: A Japanese Massive Multi-discipline Multimodal Understanding Benchmark for Culture-aware Evaluation | Oct 22, 2024 | Math | —Unverified | 0 |
| Done Is Better than Perfect: Unlocking Efficient Reasoning by Structured Multi-Turn Decomposition | May 26, 2025 | Math, Reinforcement Learning (RL) | —Unverified | 0 |
| Dolphin: A Spoken Language Proficiency Assessment System for Elementary Education | Aug 1, 2019 | Math | —Unverified | 0 |
| Beyond Sentential Semantic Parsing: Tackling the Math SAT with a Cascade of Tree Transducers | Sep 1, 2017 | Coreference Resolution | —Unverified | 0 |
| Do Large Language Models Truly Grasp Mathematics? An Empirical Exploration From Cognitive Psychology | Oct 19, 2024 | Logical Reasoning, Math | —Unverified | 0 |
| Beyond Human Data: Scaling Self-Training for Problem-Solving with Language Models | Dec 11, 2023 | Diversity, Math | —Unverified | 0 |
| Does Representation Intervention Really Identify Desired Concepts and Elicit Alignment? | May 24, 2025 | Code Generation, Math | —Unverified | 0 |
| Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model? | Apr 18, 2025 | Math, Visual Reasoning | —Unverified | 0 |
| Accurate and Diverse LLM Mathematical Reasoning via Automated PRM-Guided GFlowNets | Apr 28, 2025 | Data Augmentation, Diversity | —Unverified | 0 |
| Does Reasoning Introduce Bias? A Study of Social Bias Evaluation and Mitigation in LLM Reasoning | Feb 21, 2025 | Math | —Unverified | 0 |
| Does Reasoning Emerge? Examining the Probabilities of Causation in Large Language Models | Aug 15, 2024 | Math | —Unverified | 0 |
| Beyond Captioning: Task-Specific Prompting for Improved VLM Performance in Mathematical Reasoning | Oct 8, 2024 | Image Retrieval, Math | —Unverified | 0 |
| LeanTutor: A Formally-Verified AI Tutor for Mathematical Proofs | Jun 10, 2025 | Large Language Model, Math | —Unverified | 0 |
| Iterative Reasoning Preference Optimization | Apr 30, 2024 | ARC, GSM8K | —Unverified | 0 |
| Knowledge or Reasoning? A Close Look at How LLMs Think Across Domains | Jun 2, 2025 | Math, Reinforcement Learning (RL) | —Unverified | 0 |