| Climbing the Ladder of Reasoning: What LLMs Can-and Still Can't-Solve after SFT? | Apr 16, 2025 | Mathematical Reasoning | CodeCode Available | 1 |
| A Dual-Space Framework for General Knowledge Distillation of Large Language Models | Apr 15, 2025 | Code GenerationGeneral Knowledge | CodeCode Available | 1 |
| Teaching Large Language Models to Reason through Learning and Forgetting | Apr 15, 2025 | Mathematical Reasoning | CodeCode Available | 1 |
| Breaking the Data Barrier -- Building GUI Agents Through Task Generalization | Apr 14, 2025 | Mathematical ReasoningMultimodal Reasoning | CodeCode Available | 1 |
| GRPO-LEAD: A Difficulty-Aware Reinforcement Learning Approach for Concise Mathematical Reasoning in Language Models | Apr 13, 2025 | Mathematical Reasoning | CodeCode Available | 1 |
| Echo Chamber: RL Post-training Amplifies Behaviors Learned in Pretraining | Apr 10, 2025 | Mathematical ReasoningReinforcement Learning (RL) | CodeCode Available | 1 |
| Alice: Proactive Learning with Teacher's Demonstrations for Weak-to-Strong Generalization | Apr 9, 2025 | Logical ReasoningMathematical Reasoning | CodeCode Available | 1 |
| Boosting MLLM Reasoning with Text-Debiased Hint-GRPO | Mar 31, 2025 | Mathematical ReasoningMultimodal Reasoning | CodeCode Available | 1 |
| R-PRM: Reasoning-Driven Process Reward Modeling | Mar 27, 2025 | Mathematical Reasoning | CodeCode Available | 1 |
| Trajectory Balance with Asynchrony: Decoupling Exploration and Learning for Fast, Scalable LLM Post-Training | Mar 24, 2025 | DiversityLarge Language Model | CodeCode Available | 1 |