| Explaining RL Decisions with Trajectories | May 6, 2023 | Attribute, continuous-control | Code Available | 0 |
| Simple Noisy Environment Augmentation for Reinforcement Learning | May 4, 2023 | Data Augmentation, Diversity | Code Available | 0 |
| Scaling Pareto-Efficient Decision Making Via Offline Multi-Objective RL | Apr 30, 2023 | Decision Making, MuJoCo | Code Available | 1 |
| Meta-Reinforcement Learning Based on Self-Supervised Task Representation Learning | Apr 29, 2023 | Meta Reinforcement Learning, MuJoCo | Unverified | 0 |
| Feudal Graph Reinforcement Learning | Apr 11, 2023 | Decision Making, Graph Clustering | Code Available | 0 |
| Learning Complicated Manipulation Skills via Deterministic Policy with Limited Demonstrations | Mar 29, 2023 | Deep Reinforcement Learning, MuJoCo | Unverified | 0 |
| Merging Decision Transformers: Weight Averaging for Forming Multi-Task Policies | Mar 14, 2023 | Decision Making, MuJoCo | Code Available | 0 |
| Sample-efficient Adversarial Imitation Learning | Mar 14, 2023 | Decision Making, Imitation Learning | Unverified | 0 |
| Soft Actor-Critic Algorithm with Truly-satisfied Inequality Constraint | Mar 8, 2023 | MuJoCo | Unverified | 0 |
| Controlled Diversity with Preference : Towards Learning a Diverse Set of Desired Skills | Mar 7, 2023 | Diversity, MuJoCo | Code Available | 0 |
| A Strategy-Oriented Bayesian Soft Actor-Critic Model | Mar 7, 2023 | continuous-control, Continuous Control | Unverified | 0 |
| Wasserstein Actor-Critic: Directed Exploration via Optimism for Continuous-Actions Control | Mar 4, 2023 | MuJoCo, Q-Learning | Unverified | 0 |
| Decision Transformer under Random Frame Dropping | Mar 3, 2023 | Deep Reinforcement Learning, MuJoCo | Code Available | 0 |
| The Ladder in Chaos: A Simple and Effective Improvement to General DRL Algorithms by Policy Path Trimming and Boosting | Mar 2, 2023 | MuJoCo, Reinforcement Learning (RL) | Unverified | 0 |
| Meta-Reinforcement Learning via Exploratory Task Clustering | Feb 15, 2023 | Clustering, Meta Reinforcement Learning | Unverified | 0 |
| When Demonstrations Meet Generative World Models: A Maximum Likelihood Framework for Offline Inverse Reinforcement Learning | Feb 15, 2023 | Autonomous Driving, continuous-control | Code Available | 1 |
| Unlabeled Imperfect Demonstrations in Adversarial Imitation Learning | Feb 13, 2023 | Imitation Learning, MuJoCo | Code Available | 1 |
| Order Matters: Agent-by-agent Policy Optimization | Feb 13, 2023 | MuJoCo | Code Available | 1 |
| CLARE: Conservative Model-Based Reward Learning for Offline Inverse Reinforcement Learning | Feb 9, 2023 | continuous-control, Continuous Control | Unverified | 0 |
| Attacking Cooperative Multi-Agent Reinforcement Learning by Adversarial Minority Influence | Feb 7, 2023 | Continuous Control, MuJoCo | Code Available | 1 |
| Sample Dropout: A Simple yet Effective Variance Reduction Technique in Deep Policy Optimization | Feb 5, 2023 | Deep Reinforcement Learning, MuJoCo | Code Available | 0 |
| Online Reinforcement Learning in Non-Stationary Context-Driven Environments | Feb 4, 2023 | MuJoCo, reinforcement-learning | Code Available | 0 |
| AdaptDiffuser: Diffusion Models as Adaptive Self-evolving Planners | Feb 3, 2023 | Diversity, MuJoCo | Code Available | 1 |
| Bridging Physics-Informed Neural Networks with Reinforcement Learning: Hamilton-Jacobi-Bellman Proximal Policy Optimization (HJBPPO) | Feb 1, 2023 | MuJoCo, reinforcement-learning | Unverified | 0 |
| Neural Episodic Control with State Abstraction | Jan 27, 2023 | Deep Reinforcement Learning, MuJoCo | Unverified | 0 |
| Certifiably Robust Reinforcement Learning through Model-Based Abstract Interpretation | Jan 26, 2023 | Adversarial Robustness, MuJoCo | Unverified | 0 |
| Partial advantage estimator for proximal policy optimization | Jan 26, 2023 | MuJoCo, Policy Gradient Methods | Code Available | 1 |
| Which Experiences Are Influential for Your Agent? Policy Iteration with Turn-over Dropout | Jan 26, 2023 | MuJoCo, reinforcement-learning | Code Available | 0 |
| Joint action loss for proximal policy optimization | Jan 26, 2023 | Dota 2, MuJoCo | Code Available | 1 |
| Actor-Director-Critic: A Novel Deep Reinforcement Learning Framework | Jan 10, 2023 | Action Classification, Decision Making | Unverified | 0 |
| Genetic Imitation Learning by Reward Extrapolation | Jan 3, 2023 | Imitation Learning, MuJoCo | Unverified | 0 |
| Contextual Conservative Q-Learning for Offline Reinforcement Learning | Jan 3, 2023 | MuJoCo, Q-Learning | Unverified | 0 |
| Pontryagin Optimal Control via Neural Networks | Dec 30, 2022 | Model-based Reinforcement Learning, MuJoCo | Code Available | 0 |
| On the Geometry of Reinforcement Learning in Continuous State and Action Spaces | Dec 29, 2022 | MuJoCo, reinforcement-learning | Unverified | 0 |
| Offline Robot Reinforcement Learning with Uncertainty-Guided Human Expert Sampling | Dec 16, 2022 | MuJoCo, Q-Learning | Unverified | 0 |
| Off-Policy Deep Reinforcement Learning Algorithms for Handling Various Robotic Manipulator Tasks | Dec 11, 2022 | Deep Reinforcement Learning, MuJoCo | Unverified | 0 |
| Accelerating Self-Imitation Learning from Demonstrations via Policy Constraints and Q-Ensemble | Dec 7, 2022 | continuous-control, Continuous Control | Unverified | 0 |
| First Go, then Post-Explore: the Benefits of Post-Exploration in Intrinsic Motivation | Dec 6, 2022 | continuous-control, Continuous Control | Unverified | 0 |
| TD3 with Reverse KL Regularizer for Offline Reinforcement Learning from Mixed Datasets | Dec 5, 2022 | D4RL, MuJoCo | Code Available | 0 |
| Time-Efficient Reward Learning via Visually Assisted Cluster Ranking | Nov 30, 2022 | Dimensionality Reduction, MuJoCo | Unverified | 0 |
| Continuous Neural Algorithmic Planners | Nov 29, 2022 | continuous-control, Continuous Control | Unverified | 0 |
| Learning from Good Trajectories in Offline Multi-Agent Reinforcement Learning | Nov 28, 2022 | continuous-control, Continuous Control | Unverified | 0 |
| On the Effect of Pre-training for Transformer in Different Modality on Offline Reinforcement Learning | Nov 17, 2022 | MuJoCo | Code Available | 0 |
| Contextual Transformer for Offline Meta Reinforcement Learning | Nov 15, 2022 | D4RL, Meta Reinforcement Learning | Unverified | 0 |
| Out-of-Dynamics Imitation Learning from Multimodal Demonstrations | Nov 13, 2022 | Imitation Learning, MuJoCo | Code Available | 0 |
| Control Transformer: Robot Navigation in Unknown Environments through PRM-Guided Return-Conditioned Sequence Modeling | Nov 11, 2022 | MuJoCo, Navigate | Unverified | 0 |
| Max-Min Off-Policy Actor-Critic Method Focusing on Worst-Case Robustness to Model Misspecification | Nov 7, 2022 | MuJoCo | Code Available | 1 |
| Reward Shaping Using Convolutional Neural Network | Oct 30, 2022 | MuJoCo | Unverified | 0 |
| Imitating Opponent to Win: Adversarial Policy Imitation Learning in Two-player Competitive Games | Oct 30, 2022 | Deep Reinforcement Learning, Imitation Learning | Unverified | 0 |
| Group Distributionally Robust Reinforcement Learning with Hierarchical Latent Variables | Oct 21, 2022 | MuJoCo, reinforcement-learning | Unverified | 0 |