| ReLU to the Rescue: Improve Your On-Policy Actor-Critic with Positive Advantages | Jun 2, 2023 | Bayesian Inference, continuous-control | Code Available | 0 |
| MetaDiffuser: Diffusion Model as Conditional Planner for Offline Meta-RL | May 31, 2023 | MuJoCo, Reinforcement Learning (RL) | —Unverified | 0 |
| A Model-Based Solution to the Offline Multi-Agent Reinforcement Learning Coordination Problem | May 26, 2023 | MuJoCo, Multi-agent Reinforcement Learning | —Unverified | 0 |
| Inverse Reinforcement Learning with the Average Reward Criterion | May 24, 2023 | MuJoCo, reinforcement-learning | —Unverified | 0 |
| OER: Offline Experience Replay for Continual Offline Reinforcement Learning | May 23, 2023 | Continual Learning, MuJoCo | —Unverified | 0 |
| TOM: Learning Policy-Aware Models for Model-Based Reinforcement Learning via Transition Occupancy Matching | May 22, 2023 | Model-based Reinforcement Learning, MuJoCo | —Unverified | 0 |
| Unsupervised Discovery of Continuous Skills on a Sphere | May 21, 2023 | MuJoCo, Unsupervised Reinforcement Learning | —Unverified | 0 |
| Off-Policy Average Reward Actor-Critic with Deterministic Policy Search | May 20, 2023 | MuJoCo | Code Available | 0 |
| Client Selection for Federated Policy Optimization with Environment Heterogeneity | May 18, 2023 | MuJoCo, Policy Gradient Methods | Code Available | 0 |
| Bayesian Reparameterization of Reward-Conditioned Reinforcement Learning with Energy-based Models | May 18, 2023 | MuJoCo, Offline RL | —Unverified | 0 |
| Coagent Networks: Generalized and Scaled | May 16, 2023 | MuJoCo, Reinforcement Learning (RL) | —Unverified | 0 |
| Delay-Adapted Policy Optimization and Improved Regret for Adversarial MDP with Delayed Bandit Feedback | May 13, 2023 | MuJoCo, Reinforcement Learning (RL) | —Unverified | 0 |
| DEFENDER: DTW-Based Episode Filtering Using Demonstrations for Enhancing RL Safety | May 8, 2023 | MuJoCo | —Unverified | 0 |
| Explaining RL Decisions with Trajectories | May 6, 2023 | Attribute, continuous-control | Code Available | 0 |
| Simple Noisy Environment Augmentation for Reinforcement Learning | May 4, 2023 | Data Augmentation, Diversity | Code Available | 0 |
| Meta-Reinforcement Learning Based on Self-Supervised Task Representation Learning | Apr 29, 2023 | Meta Reinforcement Learning, MuJoCo | —Unverified | 0 |
| Feudal Graph Reinforcement Learning | Apr 11, 2023 | Decision Making, Graph Clustering | Code Available | 0 |
| Learning Complicated Manipulation Skills via Deterministic Policy with Limited Demonstrations | Mar 29, 2023 | Deep Reinforcement Learning, MuJoCo | —Unverified | 0 |
| Merging Decision Transformers: Weight Averaging for Forming Multi-Task Policies | Mar 14, 2023 | Decision Making, MuJoCo | Code Available | 0 |
| Sample-efficient Adversarial Imitation Learning | Mar 14, 2023 | Decision Making, Imitation Learning | —Unverified | 0 |
| Soft Actor-Critic Algorithm with Truly-satisfied Inequality Constraint | Mar 8, 2023 | MuJoCo | —Unverified | 0 |
| A Strategy-Oriented Bayesian Soft Actor-Critic Model | Mar 7, 2023 | continuous-control, Continuous Control | —Unverified | 0 |
| Controlled Diversity with Preference : Towards Learning a Diverse Set of Desired Skills | Mar 7, 2023 | Diversity, MuJoCo | Code Available | 0 |
| Wasserstein Actor-Critic: Directed Exploration via Optimism for Continuous-Actions Control | Mar 4, 2023 | MuJoCo, Q-Learning | —Unverified | 0 |
| Decision Transformer under Random Frame Dropping | Mar 3, 2023 | Deep Reinforcement Learning, MuJoCo | Code Available | 0 |