| Instance-adaptive Zero-shot Chain-of-Thought Prompting | Sep 30, 2024 | GSM8K, Math | —Unverified | 0 |
| Instruction-Following Pruning for Large Language Models | Jan 3, 2025 | Instruction Following, Math | —Unverified | 0 |
| Integer Networks for Data Compression with Latent-Variable Models | May 1, 2019 | Data Compression, Math | —Unverified | 0 |
| Interactive Sketchpad: A Multimodal Tutoring System for Collaborative, Visual Problem-Solving | Feb 12, 2025 | Math, multimodal interaction | —Unverified | 0 |
| Interleaved Reasoning for Large Language Models via Reinforcement Learning | May 26, 2025 | Logical Reasoning, Math | —Unverified | 0 |
| Beyond Human Data: Scaling Self-Training for Problem-Solving with Language Models | Dec 11, 2023 | Diversity, Math | —Unverified | 0 |
| Interpretable Factorization for Neural Network ECG Models | Jun 26, 2020 | Math | —Unverified | 0 |
| Interpretable Math Word Problem Solution Generation Via Step-by-step Planning | Jun 1, 2023 | GSM8K, Language Modeling | —Unverified | 0 |
| Intriguing Properties of Large Language and Vision Models | Oct 7, 2024 | cross-modal alignment, Large Language Model | —Unverified | 0 |
| Introducing the Mathematics Meme Repository | Oct 19, 2021 | Math | —Unverified | 0 |
| Introduction to Coresets: Accurate Coresets | Oct 19, 2019 | Math | —Unverified | 0 |
| Investigating Large Language Models in Diagnosing Students' Cognitive Skills in Math Problem-solving | Apr 1, 2025 | Math | —Unverified | 0 |
| Investigating Math Word Problems using Pretrained Multilingual Language Models | Jan 16, 2022 | Machine Translation, Math | —Unverified | 0 |
| Investigating Symbolic Capabilities of Large Language Models | May 21, 2024 | Math, Navigate | —Unverified | 0 |
| Investigating the Effectiveness of ChatGPT in Mathematical Reasoning and Problem Solving: Evidence from the Vietnamese National High School Graduation Examination | Jun 10, 2023 | Math, Mathematical Reasoning | —Unverified | 0 |
| Investigating the Efficacy of Large Language Models in Reflective Assessment Methods through Chain of Thoughts Prompting | Sep 30, 2023 | Math | —Unverified | 0 |
| Thinking Outside the (Gray) Box: A Context-Based Score for Assessing Value and Originality in Neural Text Generation | Feb 18, 2025 | Diversity, Math | —Unverified | 0 |
| IsoBench: Benchmarking Multimodal Foundation Models on Isomorphic Representations | Apr 1, 2024 | Benchmarking, Math | —Unverified | 0 |
| Solving Functional Optimization with Deep Networks and Variational Principles | Oct 8, 2024 | Math | —Unverified | 0 |
| Is your LLM trapped in a Mental Set? Investigative study on how mental sets affect the reasoning capabilities of LLMs | Jan 21, 2025 | GSM8K, In-Context Learning | —Unverified | 0 |
| Is Your Model Really A Good Math Reasoner? Evaluating Mathematical Reasoning with Checklist | Jul 11, 2024 | GSM8K, Math | —Unverified | 0 |
| Iterative Reasoning Preference Optimization | Apr 30, 2024 | ARC, GSM8K | —Unverified | 0 |
| Yi-Lightning Technical Report | Dec 2, 2024 | Chatbot, Large Language Model | —Unverified | 0 |
| Adaptive Guidance Accelerates Reinforcement Learning of Reasoning Models | Jun 16, 2025 | Math, reinforcement-learning | —Unverified | 0 |
| JMMMU: A Japanese Massive Multi-discipline Multimodal Understanding Benchmark for Culture-aware Evaluation | Oct 22, 2024 | Math | —Unverified | 0 |
| Beyond Captioning: Task-Specific Prompting for Improved VLM Performance in Mathematical Reasoning | Oct 8, 2024 | Image Retrieval, Math | —Unverified | 0 |
| Think Twice: Enhancing LLM Reasoning by Scaling Multi-round Test-time Thinking | Mar 25, 2025 | Math, Reinforcement Learning (RL) | —Unverified | 0 |
| Kappa Learning: A New Method for Measuring Similarity Between Educational Items Using Performance Data | Dec 20, 2018 | Clustering, Math | —Unverified | 0 |
| Key-Point-Driven Data Synthesis with its Enhancement on Mathematical Reasoning | Mar 4, 2024 | GSM8K, Math | —Unverified | 0 |
| Thought-Augmented Policy Optimization: Bridging External Guidance and Internal Capabilities | May 21, 2025 | Math, Reinforcement Learning (RL) | —Unverified | 0 |
| Knowledge or Reasoning? A Close Look at How LLMs Think Across Domains | Jun 2, 2025 | Math, Reinforcement Learning (RL) | —Unverified | 0 |
| Knowledge Tagging System on Math Questions via LLMs with Flexible Demonstration Retriever | Jun 19, 2024 | Math, Semantic Similarity | —Unverified | 0 |
| Knowledge Tagging with Large Language Model based Multi-Agent System | Sep 12, 2024 | Language Modeling, Language Modelling | —Unverified | 0 |
| Kokoyi: Executable LaTeX for End-to-end Deep Learning | Sep 29, 2021 | Deep Learning, Math | —Unverified | 0 |
| L2CEval: Evaluating Language-to-Code Generation Capabilities of Large Language Models | Sep 29, 2023 | Code Generation, Math | —Unverified | 0 |
| Better Process Supervision with Bi-directional Rewarding Signals | Mar 6, 2025 | Language Modeling, Language Modelling | —Unverified | 0 |
| Adapting the LodView RDF Browser for Navigation over the Multilingual Linguistic Linked Open Data Cloud | Aug 28, 2022 | Math | —Unverified | 0 |
| Benchmarking Reasoning Robustness in Large Language Models | Mar 6, 2025 | Benchmarking, Math | —Unverified | 0 |
| THOUGHTTERMINATOR: Benchmarking, Calibrating, and Mitigating Overthinking in Reasoning Models | Apr 17, 2025 | Benchmarking, Math | —Unverified | 0 |
| Tighter 'uniform bounds for Black-Scholes implied volatility' and the applications to root-finding | Feb 17, 2023 | Math | —Unverified | 0 |
| Language Models with Conformal Factuality Guarantees | Feb 15, 2024 | Conformal Prediction, Language Modeling | —Unverified | 0 |
| TinyGSM: achieving >80% on GSM8k with small language models | Dec 14, 2023 | Arithmetic Reasoning, GSM8K | —Unverified | 0 |
| YODA: Teacher-Student Progressive Learning for Language Models | Jan 28, 2024 | GSM8K, Math | —Unverified | 0 |
| Large Language Models Are Struggle to Cope with Unreasonability in Math Problems | Mar 28, 2024 | Math | —Unverified | 0 |
| Large Language Models as Analogical Reasoners | Oct 3, 2023 | Code Generation, GSM8K | —Unverified | 0 |
| 1bit-Merging: Dynamic Quantized Merging for Large Language Models | Feb 15, 2025 | Code Generation, Math | —Unverified | 0 |
| Large Language Models Can Self-Correct with Key Condition Verification | May 23, 2024 | Arithmetic Reasoning, Math | —Unverified | 0 |
| Large Language Models for Mathematical Reasoning: Progresses and Challenges | Jan 31, 2024 | Diversity, Math | —Unverified | 0 |
| Large Language Models Might Not Care What You Are Saying: Prompt Format Beats Descriptions | Aug 16, 2024 | Descriptive, Hallucination | —Unverified | 0 |
| Large Language Models' Understanding of Math: Source Criticism and Extrapolation | Nov 12, 2023 | Automated Theorem Proving, Math | —Unverified | 0 |