| Tighter 'uniform bounds for Black-Scholes implied volatility' and the applications to root-finding | Feb 17, 2023 | Math | —Unverified | 0 | 0 |
| Language Models with Conformal Factuality Guarantees | Feb 15, 2024 | Conformal Prediction, Language Modeling | —Unverified | 0 | 0 |
| TinyGSM: achieving >80% on GSM8k with small language models | Dec 14, 2023 | Arithmetic Reasoning, GSM8K | —Unverified | 0 | 0 |
| YODA: Teacher-Student Progressive Learning for Language Models | Jan 28, 2024 | GSM8K, Math | —Unverified | 0 | 0 |
| Large Language Models Are Struggle to Cope with Unreasonability in Math Problems | Mar 28, 2024 | Math | —Unverified | 0 | 0 |
| Large Language Models as Analogical Reasoners | Oct 3, 2023 | Code Generation, GSM8K | —Unverified | 0 | 0 |
| 1bit-Merging: Dynamic Quantized Merging for Large Language Models | Feb 15, 2025 | Code Generation, Math | —Unverified | 0 | 0 |
| Large Language Models Can Self-Correct with Key Condition Verification | May 23, 2024 | Arithmetic Reasoning, Math | —Unverified | 0 | 0 |
| Large Language Models for Mathematical Reasoning: Progresses and Challenges | Jan 31, 2024 | Diversity, Math | —Unverified | 0 | 0 |
| Large Language Models Might Not Care What You Are Saying: Prompt Format Beats Descriptions | Aug 16, 2024 | Descriptive, Hallucination | —Unverified | 0 | 0 |
| Large Language Models' Understanding of Math: Source Criticism and Extrapolation | Nov 12, 2023 | Automated Theorem Proving, Math | —Unverified | 0 | 0 |
| Token-by-Token Regeneration and Domain Biases: A Benchmark of LLMs on Advanced Mathematical Problem-Solving | Jan 28, 2025 | Math, Mathematical Problem-Solving | —Unverified | 0 | 0 |
| Token-Hungry, Yet Precise: DeepSeek R1 Highlights the Need for Multi-Step Reasoning Over Speed in MATH | Jan 30, 2025 | Language Modeling, Language Modelling | —Unverified | 0 | 0 |
| LaRS: Latent Reasoning Skills for Chain-of-Thought Reasoning | Dec 7, 2023 | In-Context Learning, Math | —Unverified | 0 | 0 |
| Benchmarking and Improving Generator-Validator Consistency of Language Models | Oct 3, 2023 | Benchmarking, Instruction Following | —Unverified | 0 | 0 |
| Layer Swapping for Zero-Shot Cross-Lingual Transfer in Large Language Models | Oct 2, 2024 | Cross-Lingual Transfer, Math | —Unverified | 0 | 0 |
| Laying the Foundation First? Investigating the Generalization from Atomic Skills to Complex Reasoning Tasks | Mar 14, 2024 | Math, Skill Generalization | —Unverified | 0 | 0 |
| BeamLoRA: Beam-Constraint Low-Rank Adaptation | Feb 19, 2025 | Code Generation, Math | —Unverified | 0 | 0 |
| Basic concepts, definitions, and methods in D number theory | Mar 21, 2020 | Math | —Unverified | 0 | 0 |
| Lean-ing on Quality: How High-Quality Data Beats Diverse Multilingual Data in AutoFormalization | Feb 18, 2025 | Math | —Unverified | 0 | 0 |
| Backward bifurcation and saddle-node bifurcation in virus-immune dynamics | Dec 1, 2021 | Math | —Unverified | 0 | 0 |
| Learning Autonomous Code Integration for Math Language Models | Feb 2, 2025 | Math | —Unverified | 0 | 0 |
| Learning Beyond Pattern Matching? Assaying Mathematical Understanding in LLMs | May 24, 2024 | In-Context Learning, Language Modeling | —Unverified | 0 | 0 |
| Learning by Analogy: Enhancing Few-Shot Prompting for Math Word Problem Solving with Computational Graph-Based Retrieval | Nov 25, 2024 | Math, Math Word Problem Solving | —Unverified | 0 | 0 |
| Token-Supervised Value Models for Enhancing Mathematical Reasoning Capabilities of Large Language Models | Jul 12, 2024 | GSM8K, Math | —Unverified | 0 | 0 |