| URLB: Unsupervised Reinforcement Learning Benchmark | Oct 28, 2021 | continuous-control, Continuous Control | Code Available | 1 |
| Towards Robust Bisimulation Metric Learning | Oct 27, 2021 | continuous-control, Continuous Control | Code Available | 1 |
| Learning from demonstrations with SACR2: Soft Actor-Critic with Reward Relabeling | Oct 27, 2021 | Autonomous Driving, Decision Making | Unverified | 0 |
| Comparing Heuristics, Constraint Optimization, and Reinforcement Learning for an Industrial 2D Packing Problem | Oct 27, 2021 | Deep Reinforcement Learning, reinforcement-learning | Unverified | 0 |
| Learning Diverse Policies in MOBA Games via Macro-Goals | Oct 27, 2021 | Deep Reinforcement Learning, Diversity | Unverified | 0 |
| Learning Domain Invariant Representations in Goal-conditioned Block MDPs | Oct 27, 2021 | Deep Reinforcement Learning, Domain Generalization | Code Available | 1 |
| Learning Collaborative Policies to Solve NP-hard Routing Problems | Oct 26, 2021 | Deep Reinforcement Learning, Traveling Salesman Problem | Code Available | 1 |
| The Difficulty of Passive Learning in Deep Reinforcement Learning | Oct 26, 2021 | Deep Reinforcement Learning, reinforcement-learning | Unverified | 0 |
| Neural PPO-Clip Attains Global Optimality: A Hinge Loss Perspective | Oct 26, 2021 | Deep Reinforcement Learning, reinforcement-learning | Unverified | 0 |
| Accelerating Distributed Deep Reinforcement Learning by In-Network Experience Sampling | Oct 26, 2021 | Deep Reinforcement Learning, reinforcement-learning | Unverified | 0 |