Reinforcement Learning Enhanced LLMs: A Survey Dec 5, 2024 reinforcement-learning Reinforcement Learning
R1-Reward: Training Multimodal Reward Model Through Stable Reinforcement Learning May 5, 2025 Reinforcement Learning (RL)
DeepMath-103K: A Large-Scale, Challenging, Decontaminated, and Verifiable Mathematical Dataset for Advancing Reasoning Apr 15, 2025 Mathematical Reasoning Reinforcement Learning (RL)
Deep Reinforcement Learning Oct 15, 2018 Deep Reinforcement Learning Management
Demystifying Long Chain-of-Thought Reasoning in LLMs Feb 5, 2025 Reinforcement Learning (RL)
R1-ShareVL: Incentivizing Reasoning Capability of Multimodal Large Language Models via Share-GRPO May 22, 2025 Reinforcement Learning (RL)
OpenSpiel: A Framework for Reinforcement Learning in Games Aug 26, 2019 General Reinforcement Learning reinforcement-learning
Tool-Star: Empowering LLM-Brained Multi-Tool Reasoner via Reinforcement Learning May 22, 2025 Reinforcement Learning (RL)
Craftax: A Lightning-Fast Benchmark for Open-Ended Reinforcement Learning Feb 26, 2024 GPU Minecraft
OpenThinkIMG: Learning to Think with Images via Visual Tool Reinforcement Learning May 13, 2025 Reinforcement Learning (RL) Visual Reasoning
Perception-R1: Pioneering Perception Policy with Reinforcement Learning Apr 10, 2025 reinforcement-learning Reinforcement Learning
OmniSafe: An Infrastructure for Accelerating Safe Reinforcement Learning Research May 16, 2023 Philosophy reinforcement-learning
o1-Coder: an o1 Replication for Coding Nov 29, 2024 Reinforcement Learning (RL)
On the Use and Misuse of Absorbing States in Multi-agent Reinforcement Learning Nov 10, 2021 Multi-agent Reinforcement Learning reinforcement-learning
CleanRL: High-quality Single-file Implementations of Deep Reinforcement Learning Algorithms Nov 16, 2021 Benchmarking Deep Reinforcement Learning
ExTrans: Multilingual Deep Reasoning Translation via Exemplar-Enhanced Reinforcement Learning May 19, 2025 Machine Translation reinforcement-learning
Adversarial Cheap Talk Nov 20, 2022 Meta-Learning Reinforcement Learning (RL)
Multi-SWE-bench: A Multilingual Benchmark for Issue Resolving Apr 3, 2025 Reinforcement Learning (RL)
Open RL Benchmark: Comprehensive Tracked Experiments for Reinforcement Learning Feb 5, 2024 reinforcement-learning Reinforcement Learning
Practical Deep Reinforcement Learning Approach for Stock Trading Nov 19, 2018 Deep Reinforcement Learning reinforcement-learning
MARLlib: A Scalable and Efficient Multi-agent Reinforcement Learning Library Oct 11, 2022 Multi-agent Reinforcement Learning reinforcement-learning
CarDreamer: Open-Source Learning Platform for World Model based Autonomous Driving May 15, 2024 Autonomous Driving Autonomous Vehicles
Mastering the Game of No-Press Diplomacy via Human-Regularized Reinforcement Learning and Planning Oct 11, 2022 reinforcement-learning Reinforcement Learning
Learning Bipedal Walking for Humanoids with Current Feedback Mar 7, 2023 Deep Reinforcement Learning Reinforcement Learning (RL)
Learning to Reason under Off-Policy Guidance Apr 21, 2025 Math Reinforcement Learning (RL)
MetaSpatial: Reinforcing 3D Spatial Reasoning in VLMs for the Metaverse Mar 24, 2025 Layout Generation Reinforcement Learning (RL)
imitation: Clean Imitation Learning Implementations Nov 22, 2022 Imitation Learning reinforcement-learning
Automatic Intrinsic Reward Shaping for Exploration in Deep Reinforcement Learning Jan 26, 2023 Benchmarking Deep Reinforcement Learning
ACEGEN: Reinforcement learning of generative chemical agents for drug discovery May 7, 2024 Benchmarking Decision Making
A Clean Slate for Offline Reinforcement Learning Apr 15, 2025 Offline RL reinforcement-learning
Graph-Reward-SQL: Execution-Free Reinforcement Learning for Text-to-SQL via Graph Matching and Stepwise Reward May 18, 2025 GPU Graph Matching
Arctic-Text2SQL-R1: Simple Rewards, Strong Reasoning in Text-to-SQL May 22, 2025 Natural Language Understanding Reinforcement Learning (RL)
ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL Feb 29, 2024 Language Modeling Language Modelling
FurnitureBench: Reproducible Real-World Benchmark for Long-Horizon Complex Manipulation May 22, 2023 Imitation Learning Motion Planning
Foundation Policies with Hilbert Representations Feb 23, 2024 Reinforcement Learning (RL) Unsupervised Pre-training
G1: Bootstrapping Perception and Reasoning Abilities of Vision-Language Model via Reinforcement Learning May 19, 2025 Language Modeling Language Modelling
FlowReasoner: Reinforcing Query-Level Meta-Agents Apr 21, 2025 Reinforcement Learning (RL)
Generalized Inner Loop Meta-Learning Oct 3, 2019 Meta-Learning reinforcement-learning
FlagVNE: A Flexible and Generalizable Reinforcement Learning Framework for Network Resource Allocation Apr 19, 2024 Decoder Network Embedding
Flightmare: A Flexible Quadrotor Simulator Sep 1, 2020 Deep Reinforcement Learning reinforcement-learning
FinRL-Meta: A Universe of Near-Real Market Environments for Data-Driven Deep Reinforcement Learning in Quantitative Finance Dec 13, 2021 Deep Reinforcement Learning GPU
Flow: A Modular Learning Framework for Mixed Autonomy Traffic Oct 16, 2017 Autonomous Vehicles Deep Reinforcement Learning
Smooth Exploration for Robotic Reinforcement Learning May 12, 2020 continuous-control Continuous Control
Fiber: A Platform for Efficient Development and Distributed Training for Reinforcement Learning and Population-Based Methods Mar 25, 2020 Distributed Computing Reinforcement Learning
Fine-Tuning Discrete Diffusion Models via Reward Optimization with Applications to DNA and Protein Design Oct 17, 2024 Protein Design Reinforcement Learning (RL)
Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning Feb 10, 2025 Math Mathematical Reasoning
Feedback Efficient Online Fine-Tuning of Diffusion Models Feb 26, 2024 reinforcement-learning Reinforcement Learning
AndroidEnv: A Reinforcement Learning Platform for Android May 27, 2021 reinforcement-learning Reinforcement Learning
EV2Gym: A Flexible V2G Simulator for EV Smart Charging Research and Benchmarking Apr 2, 2024 Benchmarking Reinforcement Learning (RL)
Equivariant Ensembles and Regularization for Reinforcement Learning in Map-based Path Planning Mar 19, 2024 Inductive Bias Reinforcement Learning (RL)