| Causal Head Gating: A Framework for Interpreting Roles of Attention Heads in Transformers | May 19, 2025 | In-Context Learning, Instruction Following | —Unverified | 0 |
| AI4Math: A Native Spanish Benchmark for University-Level Mathematical Reasoning in Large Language Models | May 25, 2025 | Math, Mathematical Reasoning | —Unverified | 0 |
| Exploring Mathematical Extrapolation of Large Language Models with Synthetic Data | Jun 4, 2024 | Mathematical Reasoning, Text Generation | —Unverified | 0 |
| Can Theoretical Physics Research Benefit from Language Agents? | Jun 6, 2025 | Code Generation, Mathematical Reasoning | —Unverified | 0 |
| Assessing the Impact of Prompting Methods on ChatGPT's Mathematical Capabilities | Dec 22, 2023 | Chatbot, GSM8K | —Unverified | 0 |
| Explain with Visual Keypoints Like a Real Mentor! A Benchmark for Multimodal Solution Explanation | Apr 4, 2025 | Math, Mathematical Reasoning | —Unverified | 0 |
| Expediting and Elevating Large Language Model Reasoning via Hidden Chain-of-Thought Decoding | Sep 13, 2024 | Contrastive Learning, Language Modeling | —Unverified | 0 |
| Can Pruning Improve Reasoning? Revisiting Long-CoT Compression with Capability in Mind for Better Reasoning | May 20, 2025 | Large Language Model, Mathematical Reasoning | —Unverified | 0 |
| Layer Swapping for Zero-Shot Cross-Lingual Transfer in Large Language Models | Oct 2, 2024 | Cross-Lingual Transfer, Math | —Unverified | 0 |
| Expanding Search Space with Diverse Prompting Agents: An Efficient Sampling Approach for LLM Mathematical Reasoning | Oct 13, 2024 | Math, Mathematical Reasoning | —Unverified | 0 |