| Matryoshka Model Learning for Improved Elastic Student Models | May 29, 2025 | LAMBADA, Math | —Unverified | 0 | 0 |
| Asymptotic expression for the fixation probability of a mutant in star graphs | Mar 18, 2016 | Math | —Unverified | 0 | 0 |
| Maximizing Confidence Alone Improves Reasoning | May 28, 2025 | GSM8K, Math | —Unverified | 0 | 0 |
| MDIT: A Model-free Data Interpolation Method for Diverse Instruction Tuning | Apr 9, 2025 | Code Generation, Diversity | —Unverified | 0 | 0 |
| Training Large Language Models to Reason via EM Policy Gradient | Apr 24, 2025 | GSM8K, Math | —Unverified | 0 | 0 |
| Measurement to Meaning: A Validity-Centered Framework for AI Evaluation | May 13, 2025 | Math | —Unverified | 0 | 0 |
| Measuring and Improving BERT's Mathematical Abilities by Predicting the Order of Reasoning | Jun 7, 2021 | Language Modeling, Language Modelling | —Unverified | 0 | 0 |
| Measuring and Improving BERT's Mathematical Abilities by Predicting the Order of Reasoning. | Aug 1, 2021 | Language Modeling, Language Modelling | —Unverified | 0 | 0 |
| When Not to Answer: Evaluating Prompts on GPT Models for Effective Abstention in Unanswerable Math Word Problems | Oct 16, 2024 | Hallucination, Math | —Unverified | 0 | 0 |
| Measuring Large Language Models Capacity to Annotate Journalistic Sourcing | Dec 30, 2024 | Benchmarking, Ethics | —Unverified | 0 | 0 |