| TOM: Learning Policy-Aware Models for Model-Based Reinforcement Learning via Transition Occupancy Matching | May 22, 2023 | Model-based Reinforcement Learning, MuJoCo | —Unverified | 0 |
| Unsupervised Discovery of Continuous Skills on a Sphere | May 21, 2023 | MuJoCo, Unsupervised Reinforcement Learning | —Unverified | 0 |
| Off-Policy Average Reward Actor-Critic with Deterministic Policy Search | May 20, 2023 | MuJoCo | Code Available | 0 |
| Client Selection for Federated Policy Optimization with Environment Heterogeneity | May 18, 2023 | MuJoCo, Policy Gradient Methods | Code Available | 0 |
| Bayesian Reparameterization of Reward-Conditioned Reinforcement Learning with Energy-based Models | May 18, 2023 | MuJoCo, Offline RL | —Unverified | 0 |
| Coagent Networks: Generalized and Scaled | May 16, 2023 | MuJoCo, Reinforcement Learning (RL) | —Unverified | 0 |
| Delay-Adapted Policy Optimization and Improved Regret for Adversarial MDP with Delayed Bandit Feedback | May 13, 2023 | MuJoCo, Reinforcement Learning (RL) | —Unverified | 0 |
| DEFENDER: DTW-Based Episode Filtering Using Demonstrations for Enhancing RL Safety | May 8, 2023 | MuJoCo | —Unverified | 0 |
| Explaining RL Decisions with Trajectories | May 6, 2023 | Attribute, continuous-control | Code Available | 0 |
| Simple Noisy Environment Augmentation for Reinforcement Learning | May 4, 2023 | Data Augmentation, Diversity | Code Available | 0 |