| Hybrid Value Estimation for Off-policy Evaluation and Offline Reinforcement Learning | Jun 4, 2022 | MuJoCo, Off-policy evaluation | Unverified | 0 |
| Transferable Reward Learning by Dynamics-Agnostic Discriminator Ensemble | Jun 1, 2022 | Imitation Learning, MuJoCo | Unverified | 0 |
| Multi-Object Grasping in the Plane | Jun 1, 2022 | MuJoCo, Object | Unverified | 0 |
| TaSIL: Taylor Series Imitation Learning | May 30, 2022 | continuous-control, Continuous Control | Code Available | 0 |
| Efficient Reward Poisoning Attacks on Online Deep Reinforcement Learning | May 30, 2022 | Data Poisoning, Deep Reinforcement Learning | Code Available | 0 |
| SEREN: Knowing When to Explore and When to Exploit | May 30, 2022 | MuJoCo, Reinforcement Learning (RL) | Unverified | 0 |
| Data Valuation for Offline Reinforcement Learning | May 19, 2022 | Data Valuation, Deep Reinforcement Learning | Unverified | 0 |
| Imitation Learning from Observations under Transition Model Disparity | Apr 25, 2022 | Imitation Learning, model | Code Available | 0 |
| A Computational Theory of Learning Flexible Reward-Seeking Behavior with Place Cells | Apr 22, 2022 | MuJoCo, Open-Ended Question Answering | Unverified | 0 |
| Continuously Discovering Novel Strategies via Reward-Switching Policy Optimization | Apr 4, 2022 | continuous-control, Continuous Control | Unverified | 0 |