| Divide-and-Conquer Meets Consensus: Unleashing the Power of Functions in Code Generation | May 30, 2024 | Code Generation, HumanEval | —Unverified | 0 | 0 |
| dMath: A Scalable Linear Algebra and Math Library for Heterogeneous GP-GPU Architectures | Apr 5, 2016 | GPU, Management | —Unverified | 0 | 0 |
| dMath: Distributed Linear Algebra for DL | Nov 19, 2016 | GPU, Management | —Unverified | 0 | 0 |
| Does Reasoning Emerge? Examining the Probabilities of Causation in Large Language Models | Aug 15, 2024 | Math | —Unverified | 0 | 0 |
| Does Reasoning Introduce Bias? A Study of Social Bias Evaluation and Mitigation in LLM Reasoning | Feb 21, 2025 | Math | —Unverified | 0 | 0 |
| Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model? | Apr 18, 2025 | Math, Visual Reasoning | —Unverified | 0 | 0 |
| Does Representation Intervention Really Identify Desired Concepts and Elicit Alignment? | May 24, 2025 | Code Generation, Math | —Unverified | 0 | 0 |
| TelecomGPT: A Framework to Build Telecom-Specfic Large Language Models | Jul 12, 2024 | Code Generation, Math | —Unverified | 0 | 0 |
| Do Large Language Models Truly Grasp Mathematics? An Empirical Exploration From Cognitive Psychology | Oct 19, 2024 | Logical Reasoning, Math | —Unverified | 0 | 0 |
| Dolphin: A Spoken Language Proficiency Assessment System for Elementary Education | Aug 1, 2019 | Math | —Unverified | 0 | 0 |
| Done Is Better than Perfect: Unlocking Efficient Reasoning by Structured Multi-Turn Decomposition | May 26, 2025 | Math, Reinforcement Learning (RL) | —Unverified | 0 | 0 |
| Temperature and Persona Shape LLM Agent Consensus With Minimal Accuracy Gains in Qualitative Coding | Jul 15, 2025 | Math | —Unverified | 0 | 0 |
| Walk Before You Run! Concise LLM Reasoning via Reinforcement Learning | May 27, 2025 | Math | —Unverified | 0 | 0 |
| Do Thinking Tokens Help or Trap? Towards More Efficient Large Reasoning Model | Jun 30, 2025 | Math | —Unverified | 0 | 0 |
| DrawEduMath: Evaluating Vision Language Models with Expert-Annotated Students' Hand-Drawn Math Images | Jan 24, 2025 | Math | —Unverified | 0 | 0 |
| Cascaded Self-Evaluation Augmented Training for Efficient Multimodal Large Language Models | Jan 10, 2025 | Math | —Unverified | 0 | 0 |
| Can you hear me now? Sensitive comparisons of human and machine perception | Mar 27, 2020 | Math, speech-recognition | —Unverified | 0 | 0 |
| Dualformer: Controllable Fast and Slow Thinking by Learning with Randomized Reasoning Traces | Oct 13, 2024 | Computational Efficiency, Math | —Unverified | 0 | 0 |
| DynaMath: A Dynamic Visual Benchmark for Evaluating Mathematical Reasoning Robustness of Vision Language Models | Oct 29, 2024 | Math, Mathematical Reasoning | —Unverified | 0 | 0 |
| Can We Further Elicit Reasoning in LLMs? Critic-Guided Planning with Retrieval-Augmentation for Solving Challenging Tasks | Oct 2, 2024 | Math, Navigate | —Unverified | 0 | 0 |
| Testing GPT-4-o1-preview on math and science problems: A follow-up study | Oct 11, 2024 | Math, Spatial Reasoning | —Unverified | 0 | 0 |
| Dynamic Scheduling of MPI-based Distributed Deep Learning Training Jobs | Aug 21, 2019 | Deep Learning, Math | —Unverified | 0 | 0 |
| Dynamic Skill Adaptation for Large Language Models | Dec 26, 2024 | Math | —Unverified | 0 | 0 |
| Testing GPT-4 with Wolfram Alpha and Code Interpreter plug-ins on math and science problems | Aug 10, 2023 | Language Modeling, Language Modelling | —Unverified | 0 | 0 |
| EasyMath: A 0-shot Math Benchmark for SLMs | May 20, 2025 | Math | —Unverified | 0 | 0 |