| Continuous Control With Ensemble Deep Deterministic Policy Gradients | Nov 30, 2021 | continuous-control, Continuous Control | Code Available | 0 | 5 |
| Offline Reinforcement Learning via Inverse Optimization | Feb 27, 2025 | Model Predictive Control, MuJoCo | Code Available | 0 | 5 |
| Imitation Learning from Purified Demonstrations | Oct 11, 2023 | Decision Making, Imitation Learning | Code Available | 0 | 5 |
| Imitation Learning from Observations under Transition Model Disparity | Apr 25, 2022 | Imitation Learning, model | Code Available | 0 | 5 |
| Asynchronous Methods for Model-Based Reinforcement Learning | Oct 28, 2019 | model, Model-based Reinforcement Learning | Code Available | 0 | 5 |
| MuJoCo: A physics engine for model-based control | Oct 7, 2012 | model, MuJoCo | Code Available | 0 | 5 |
| Off-Policy Average Reward Actor-Critic with Deterministic Policy Search | May 20, 2023 | MuJoCo | Code Available | 0 | 5 |
| ORRB -- OpenAI Remote Rendering Backend | Jun 26, 2019 | MuJoCo | Code Available | 0 | 5 |
| Context-Based Soft Actor Critic for Environments with Non-stationary Dynamics | May 7, 2021 | continuous-control, Continuous Control | Code Available | 0 | 5 |
| Weak Human Preference Supervision For Deep Reinforcement Learning | Jul 25, 2020 | Deep Reinforcement Learning, MuJoCo | Code Available | 0 | 5 |
| Imitating from auxiliary imperfect demonstrations via Adversarial Density Weighted Regression | May 28, 2024 | Imitation Learning, MuJoCo | Code Available | 0 | 5 |
| Constrained Intrinsic Motivation for Reinforcement Learning | Jul 12, 2024 | MuJoCo, reinforcement-learning | Code Available | 0 | 5 |
| Asynchronous Episodic Deep Deterministic Policy Gradient: Towards Continuous Control in Computationally Complex Environments | Mar 3, 2019 | continuous-control, Continuous Control | Code Available | 0 | 5 |
| MDP Playground: An Analysis and Debug Testbed for Reinforcement Learning | Sep 17, 2019 | MuJoCo, OpenAI Gym | Code Available | 0 | 5 |
| Mildly Constrained Evaluation Policy for Offline Reinforcement Learning | Jun 6, 2023 | D4RL, MuJoCo | Code Available | 0 | 5 |
| Lyapunov-based Safe Policy Optimization for Continuous Control | Jan 28, 2019 | continuous-control, Continuous Control | Code Available | 0 | 5 |
| Online Reinforcement Learning in Non-Stationary Context-Driven Environments | Feb 4, 2023 | MuJoCo, reinforcement-learning | Code Available | 0 | 5 |
| Locally Persistent Exploration in Continuous Control Tasks with Sparse Rewards | Dec 26, 2020 | continuous-control, Continuous Control | Code Available | 0 | 5 |
| Heterogeneous Multi-Agent Reinforcement Learning via Mirror Descent Policy Optimization | Aug 13, 2023 | LEMMA, MuJoCo | Code Available | 0 | 5 |
| LightZero: A Unified Benchmark for Monte Carlo Tree Search in General Sequential Decision Scenarios | Oct 12, 2023 | Board Games, Decision Making | Code Available | 0 | 5 |
| Live in the Moment: Learning Dynamics Model Adapted to Evolving Policy | Jul 25, 2022 | continuous-control, Continuous Control | Code Available | 0 | 5 |
| LLMs for sensory-motor control: Combining in-context and iterative learning | Jun 5, 2025 | MuJoCo | Code Available | 0 | 5 |
| Hard-Thresholding Meets Evolution Strategies in Reinforcement Learning | May 2, 2024 | Decision Making, MuJoCo | Code Available | 0 | 5 |
| Comparing the Efficacy of Fine-Tuning and Meta-Learning for Few-Shot Policy Imitation | Jun 23, 2023 | Few-Shot Image Classification, Few-Shot Imitation Learning | Code Available | 0 | 5 |
| Learning to Play Cup-and-Ball with Noisy Camera Observations | Jul 19, 2020 | MuJoCo | Code Available | 0 | 5 |
| Hierarchical Reinforcement Learning with Advantage-Based Auxiliary Rewards | Oct 10, 2019 | Hierarchical Reinforcement Learning, MuJoCo | Code Available | 0 | 5 |
| Handling Delay in Real-Time Reinforcement Learning | Mar 30, 2025 | MuJoCo, reinforcement-learning | Code Available | 0 | 5 |
| Learning Task Belief Similarity with Latent Dynamics for Meta-Reinforcement Learning | Jun 24, 2025 | Meta Reinforcement Learning, MuJoCo | Code Available | 0 | 5 |
| Learning What To Do by Simulating the Past | Apr 8, 2021 | MuJoCo | Code Available | 0 | 5 |
| Learning Powerful Policies by Using Consistent Dynamics Model | Jun 11, 2019 | Atari Games, model | Code Available | 0 | 5 |
| Comparing Model-free and Model-based Algorithms for Offline Reinforcement Learning | Jan 14, 2022 | model, MuJoCo | Code Available | 0 | 5 |
| Human-guided Robot Behavior Learning: A GAN-assisted Preference-based Reinforcement Learning Approach | Oct 15, 2020 | Generative Adversarial Network, MuJoCo | Code Available | 0 | 5 |
| Learning non-Markovian Decision-Making from State-only Sequences | Jun 27, 2023 | Decision Making, Imitation Learning | Code Available | 0 | 5 |
| Learning Calibratable Policies using Programmatic Style-Consistency | Oct 2, 2019 | Imitation Learning, MuJoCo | Code Available | 0 | 5 |
| Action Robust Reinforcement Learning and Applications in Continuous Control | Jan 26, 2019 | continuous-control, Continuous Control | Code Available | 0 | 5 |
| Merging Decision Transformers: Weight Averaging for Forming Multi-Task Policies | Mar 14, 2023 | Decision Making, MuJoCo | Code Available | 0 | 5 |
| GO-DICE: Goal-Conditioned Option-Aware Offline Imitation Learning via Stationary Distribution Correction Estimation | Dec 17, 2023 | Imitation Learning, MuJoCo | Code Available | 0 | 5 |
| Language as an Abstraction for Hierarchical Deep Reinforcement Learning | Jun 18, 2019 | Deep Reinforcement Learning, Instruction Following | Code Available | 0 | 5 |
| Learning Generalizable Skills from Offline Multi-Task Data for Multi-Agent Cooperation | Mar 27, 2025 | MuJoCo, SMAC | Code Available | 0 | 5 |
| Is Mamba Compatible with Trajectory Optimization in Offline Reinforcement Learning? | May 20, 2024 | Atari Games, Mamba | Code Available | 0 | 5 |
| Generalized Off-Policy Actor-Critic | Mar 27, 2019 | counterfactual, MuJoCo | Code Available | 0 | 5 |
| Learning Goal Embeddings via Self-Play for Hierarchical Reinforcement Learning | Nov 22, 2018 | Hierarchical Reinforcement Learning, MuJoCo | Code Available | 0 | 5 |
| Leveraging exploration in off-policy algorithms via normalizing flows | May 16, 2019 | continuous-control, Continuous Control | Code Available | 0 | 5 |
| Robust Reinforcement Learning via Adversarial training with Langevin Dynamics | Feb 14, 2020 | MuJoCo, reinforcement-learning | Code Available | 0 | 5 |
| Generalized Maximum Entropy Reinforcement Learning via Reward Shaping | Sep 29, 2021 | MuJoCo, reinforcement-learning | Unverified | 0 | 0 |
| Generalized Hidden Parameter MDPs Transferable Model-based RL in a Handful of Trials | Feb 8, 2020 | MuJoCo | Unverified | 0 | 0 |
| Coagent Networks: Generalized and Scaled | May 16, 2023 | MuJoCo, Reinforcement Learning (RL) | Unverified | 0 | 0 |
| Gaussian Process Policy Optimization | Mar 2, 2020 | MuJoCo, reinforcement-learning | Unverified | 0 | 0 |
| From proprioception to long-horizon planning in novel environments: A hierarchical RL model | Jun 11, 2020 | Efficient Exploration, Model Predictive Control | Unverified | 0 | 0 |
| FP3O: Enabling Proximal Policy Optimization in Multi-Agent Cooperation with Parameter-Sharing Versatility | Oct 8, 2023 | MuJoCo, Multi-agent Reinforcement Learning | Unverified | 0 | 0 |