| AReaL: A Large-Scale Asynchronous Reinforcement Learning System for Language Reasoning | May 30, 2025 | GPUMath | CodeCode Available | 7 |
| TTRL: Test-Time Reinforcement Learning | Apr 22, 2025 | Mathreinforcement-learning | CodeCode Available | 7 |
| SimpleRL-Zoo: Investigating and Taming Zero Reinforcement Learning for Open Base Models in the Wild | Mar 24, 2025 | Instruction FollowingMath | CodeCode Available | 7 |
| xLSTM 7B: A Recurrent LLM for Fast and Efficient Inference | Mar 17, 2025 | MambaMath | CodeCode Available | 7 |
| S*: Test Time Scaling for Code Generation | Feb 20, 2025 | Code GenerationMath | CodeCode Available | 7 |
| Logic-RL: Unleashing LLM Reasoning with Rule-Based Reinforcement Learning | Feb 20, 2025 | Mathreinforcement-learning | CodeCode Available | 7 |
| LLMs Can Easily Learn to Reason from Demonstrations Structure, not content, is what matters! | Feb 11, 2025 | Large Language ModelMath | CodeCode Available | 7 |
| Kimi k1.5: Scaling Reinforcement Learning with LLMs | Jan 22, 2025 | Mathreinforcement-learning | CodeCode Available | 7 |
| rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking | Jan 8, 2025 | Math | CodeCode Available | 7 |
| O1 Replication Journey: A Strategic Progress Report -- Part 1 | Oct 8, 2024 | Mathscientific discovery | CodeCode Available | 7 |