| Kalman Filter Enhanced GRPO for Reinforcement Learning-Based Language Model Reasoning | May 12, 2025 | Language ModelingLanguage Modelling | CodeCode Available | 1 |
| Rewriting Pre-Training Data Boosts LLM Performance in Math and Code | May 5, 2025 | Code GenerationGSM8K | CodeCode Available | 1 |
| NeMo-Inspector: A Visualization Tool for LLM Generation Analysis | May 1, 2025 | GSM8KMath | CodeCode Available | 1 |
| DeepCritic: Deliberate Critique with Large Language Models | May 1, 2025 | Math | CodeCode Available | 1 |
| Efficient Reasoning for LLMs through Speculative Chain-of-Thought | Apr 27, 2025 | GSM8KMath | CodeCode Available | 1 |
| Benchmarking Multimodal Mathematical Reasoning with Explicit Visual Dependency | Apr 24, 2025 | BenchmarkingMath | CodeCode Available | 1 |
| Fine-Tuning Large Language Models on Quantum Optimization Problems for Circuit Generation | Apr 15, 2025 | MathQuantum Machine Learning | CodeCode Available | 1 |
| M1: Towards Scalable Test-Time Compute with Mamba Reasoning Models | Apr 14, 2025 | MambaMath | CodeCode Available | 1 |
| Efficient Process Reward Model Training via Active Learning | Apr 14, 2025 | Active LearningMath | CodeCode Available | 1 |
| The Jailbreak Tax: How Useful are Your Jailbreak Outputs? | Apr 14, 2025 | Math | CodeCode Available | 1 |