DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning | Jan 22, 2025 | Mathematical Reasoning, Multi-task Language Understanding | Code Available: 15
Introduction to Reinforcement Learning | Aug 13, 2024 | reinforcement-learning, Reinforcement Learning | Code Available: 11
Gymnasium: A Standard Interface for Reinforcement Learning Environments | Jul 24, 2024 | reinforcement-learning, Reinforcement Learning | Code Available: 11
SkyReels-V2: Infinite-length Film Generative Model | Apr 17, 2025 | Large Language Model, model | Code Available: 9
VLM-R1: A Stable and Generalizable R1-style Large Vision-Language Model | Apr 10, 2025 | Language Modeling, Language Modelling | Code Available: 9
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model | May 7, 2024 | Language Modeling, Language Modelling | Code Available: 9
MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention | Jun 16, 2025 | Mixture-of-Experts, Reinforcement Learning (RL) | Code Available: 7
AReaL: A Large-Scale Asynchronous Reinforcement Learning System for Language Reasoning | May 30, 2025 | GPU, Math | Code Available: 7
An Empirical Study on Reinforcement Learning for Reasoning-Search Interleaved LLM Agents | May 21, 2025 | Reinforcement Learning (RL) | Code Available: 7
Flow-GRPO: Training Flow Matching Models via Online RL | May 8, 2025 | Denoising, Diversity | Code Available: 7
RAGEN: Understanding Self-Evolution in LLM Agents via Multi-Turn Reinforcement Learning | Apr 24, 2025 | Decision Making, Reinforcement Learning (RL) | Code Available: 7
TTRL: Test-Time Reinforcement Learning | Apr 22, 2025 | Math, reinforcement-learning | Code Available: 7
SimpleRL-Zoo: Investigating and Taming Zero Reinforcement Learning for Open Base Models in the Wild | Mar 24, 2025 | Instruction Following, Math | Code Available: 7
Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning | Mar 12, 2025 | Question Answering, RAG | Code Available: 7
Logic-RL: Unleashing LLM Reasoning with Rule-Based Reinforcement Learning | Feb 20, 2025 | Math, reinforcement-learning | Code Available: 7
EvoRL: A GPU-accelerated Framework for Evolutionary Reinforcement Learning | Jan 25, 2025 | Benchmarking, Evolutionary Algorithms | Code Available: 7
Kimi k1.5: Scaling Reinforcement Learning with LLMs | Jan 22, 2025 | Math, reinforcement-learning | Code Available: 7
The Dormant Neuron Phenomenon in Deep Reinforcement Learning | Feb 24, 2023 | Deep Reinforcement Learning, reinforcement-learning | Code Available: 6
FinRL-Meta: Market Environments and Benchmarks for Data-Driven Financial Reinforcement Learning | Nov 6, 2022 | Deep Reinforcement Learning, reinforcement-learning | Code Available: 6
RAG-R1: Incentivize the Search and Reasoning Capabilities of LLMs through Multi-query Parallelism | Jun 30, 2025 | Question Answering, RAG | Code Available: 5
LongWriter-Zero: Mastering Ultra-Long Text Generation via Reinforcement Learning | Jun 23, 2025 | Reinforcement Learning (RL), Text Generation | Code Available: 5
SoundMind: RL-Incentivized Logic Reasoning for Audio-Language Models | Jun 15, 2025 | Logical Reasoning, Reinforcement Learning (RL) | Code Available: 5
ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models | May 30, 2025 | Reinforcement Learning (RL) | Code Available: 5
Group-in-Group Policy Optimization for LLM Agent Training | May 16, 2025 | GPU, Mathematical Reasoning | Code Available: 5
DanceGRPO: Unleashing GRPO on Visual Generation | May 12, 2025 | Denoising, reinforcement-learning | Code Available: 5
ZeroSearch: Incentivize the Search Capability of LLMs without Searching | May 7, 2025 | Reinforcement Learning (RL), Retrieval | Code Available: 5
Kimi-VL Technical Report | Apr 10, 2025 | Long-Context Understanding, Mathematical Reasoning | Code Available: 5
Understanding R1-Zero-Like Training: A Critical Perspective | Mar 26, 2025 | Reinforcement Learning (RL) | Code Available: 5
Vision-R1: Incentivizing Reasoning Capability in Multimodal Large Language Models | Mar 9, 2025 | Math, Multimodal Reasoning | Code Available: 5
Process Reinforcement through Implicit Rewards | Feb 3, 2025 | Math, Reinforcement Learning (RL) | Code Available: 5
HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs | Dec 25, 2024 | Reinforcement Learning (RL) | Code Available: 5
Marco-o1: Towards Open Reasoning Models for Open-Ended Solutions | Nov 21, 2024 | Reinforcement Learning (RL) | Code Available: 5
Multi-Agent Reinforcement Learning for Autonomous Driving: A Survey | Aug 19, 2024 | Autonomous Driving, Decision Making | Code Available: 5
Enhancing Efficiency of Safe Reinforcement Learning via Sample Manipulation | May 31, 2024 | MuJoCo, reinforcement-learning | Code Available: 5
Balance Reward and Safety Optimization for Safe Reinforcement Learning: A Perspective of Gradient Manipulation | May 2, 2024 | MuJoCo, Reinforcement Learning (RL) | Code Available: 5
Humanoid-Gym: Reinforcement Learning for Humanoid Robot with Zero-Shot Sim2Real Transfer | Apr 8, 2024 | MuJoCo, Physical Simulations | Code Available: 5
Orbit: A Unified Simulation Framework for Interactive Robot Learning Environments | Jan 10, 2023 | GPU, Imitation Learning | Code Available: 5
EnvPool: A Highly Parallel Reinforcement Learning Environment Execution Engine | Jun 21, 2022 | MuJoCo, reinforcement-learning | Code Available: 5
Kwai Keye-VL Technical Report | Jul 2, 2025 | Instruction Following, Reinforcement Learning (RL) | Code Available: 4
DiffuCoder: Understanding and Improving Masked Diffusion Models for Code Generation | Jun 25, 2025 | Code Generation, Denoising | Code Available: 4
Skywork Open Reasoner 1 Technical Report | May 28, 2025 | Math, Reinforcement Learning (RL) | Code Available: 4
QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning | May 23, 2025 | Question Answering, Reinforcement Learning (RL) | Code Available: 4
Delving into RL for Image Generation with CoT: A Study on DPO vs. GRPO | May 22, 2025 | Domain Generalization, Image Generation | Code Available: 4
s3: You Don't Need That Much Data to Train a Search Agent via RL | May 20, 2025 | RAG, Reinforcement Learning (RL) | Code Available: 4
CPGD: Toward Stable Rule-based Reinforcement Learning for Language Models | May 18, 2025 | Reinforcement Learning (RL) | Code Available: 4
T2I-R1: Reinforcing Image Generation with Collaborative Semantic-level and Token-level CoT | May 1, 2025 | Image Generation, Reinforcement Learning (RL) | Code Available: 4
Zero-Shot Whole-Body Humanoid Control via Behavioral Foundation Models | Apr 15, 2025 | Humanoid Control, Reinforcement Learning (RL) | Code Available: 4
DeepResearcher: Scaling Deep Research via Reinforcement Learning in Real-world Environments | Apr 4, 2025 | Navigate, Prompt Engineering | Code Available: 4
Video-R1: Reinforcing Video Reasoning in MLLMs | Mar 27, 2025 | MVBench, Reinforcement Learning (RL) | Code Available: 4
Fin-R1: A Large Language Model for Financial Reasoning through Reinforcement Learning | Mar 20, 2025 | Decision Making, Language Modeling | Code Available: 4