| Efficient Fine-Tuning of Quantized Models via Adaptive Rank and Bitwidth | May 2, 2025 | GSM8KQuantization | —Unverified | 0 |
| TutorGym: A Testbed for Evaluating AI Agents as Tutors and Students | May 2, 2025 | GSM8KIn-Context Learning | CodeCode Available | 0 |
| NeMo-Inspector: A Visualization Tool for LLM Generation Analysis | May 1, 2025 | GSM8KMath | CodeCode Available | 1 |
| Local Prompt Optimization | Apr 29, 2025 | GSM8KMath | —Unverified | 0 |
| Trace-of-Thought Prompting: Investigating Prompt-Based Knowledge Distillation Through Question Decomposition | Apr 29, 2025 | GSM8KKnowledge Distillation | —Unverified | 0 |
| AutoJudge: Judge Decoding Without Manual Annotation | Apr 28, 2025 | GSM8KLarge Language Model | —Unverified | 0 |
| Efficient Reasoning for LLMs through Speculative Chain-of-Thought | Apr 27, 2025 | GSM8KMath | CodeCode Available | 1 |
| Training Large Language Models to Reason via EM Policy Gradient | Apr 24, 2025 | GSM8KMath | —Unverified | 0 |
| Dynamic Early Exit in Reasoning Models | Apr 22, 2025 | GSM8KMath | CodeCode Available | 2 |
| Not All Rollouts are Useful: Down-Sampling Rollouts in LLM Reinforcement Learning | Apr 18, 2025 | AllGSM8K | —Unverified | 0 |