| Live in the Moment: Learning Dynamics Model Adapted to Evolving Policy | Jul 25, 2022 | continuous-control, Continuous Control | Code Available | 0 |
| Resolving Copycat Problems in Visual Imitation Learning via Residual Action Prediction | Jul 20, 2022 | Imitation Learning, MuJoCo | Unverified | 0 |
| Feasible Adversarial Robust Reinforcement Learning for Underspecified Environments | Jul 19, 2022 | MuJoCo, reinforcement-learning | Unverified | 0 |
| Prompting Decision Transformer for Few-Shot Policy Generalization | Jun 27, 2022 | Few-Shot Learning, Inductive Bias | Unverified | 0 |
| CGAR: Critic Guided Action Redistribution in Reinforcement Leaning | Jun 23, 2022 | MuJoCo, Reinforcement Learning (RL) | Code Available | 0 |
| Fighting Fire with Fire: Avoiding DNN Shortcuts through Priming | Jun 22, 2022 | Autonomous Driving, Classification | Unverified | 0 |
| Beyond Rewards: a Hierarchical Perspective on Offline Multiagent Behavioral Analysis | Jun 17, 2022 | MuJoCo, Starcraft | Unverified | 0 |
| Mean-Semivariance Policy Optimization via Risk-Averse Reinforcement Learning | Jun 15, 2022 | Autonomous Driving, continuous-control | Unverified | 0 |
| Relative Policy-Transition Optimization for Fast Policy Transfer | Jun 13, 2022 | continuous-control, Continuous Control | Unverified | 0 |
| Dealing with Sparse Rewards in Continuous Control Robotics via Heavy-Tailed Policies | Jun 12, 2022 | continuous-control, Continuous Control | Unverified | 0 |
| Hybrid Value Estimation for Off-policy Evaluation and Offline Reinforcement Learning | Jun 4, 2022 | MuJoCo, Off-policy evaluation | Unverified | 0 |
| Transferable Reward Learning by Dynamics-Agnostic Discriminator Ensemble | Jun 1, 2022 | Imitation Learning, MuJoCo | Unverified | 0 |
| Multi-Object Grasping in the Plane | Jun 1, 2022 | MuJoCo, Object | Unverified | 0 |
| TaSIL: Taylor Series Imitation Learning | May 30, 2022 | continuous-control, Continuous Control | Code Available | 0 |
| Efficient Reward Poisoning Attacks on Online Deep Reinforcement Learning | May 30, 2022 | Data Poisoning, Deep Reinforcement Learning | Code Available | 0 |
| SEREN: Knowing When to Explore and When to Exploit | May 30, 2022 | MuJoCo, Reinforcement Learning (RL) | Unverified | 0 |
| Data Valuation for Offline Reinforcement Learning | May 19, 2022 | Data Valuation, Deep Reinforcement Learning | Unverified | 0 |
| Imitation Learning from Observations under Transition Model Disparity | Apr 25, 2022 | Imitation Learning, model | Code Available | 0 |
| A Computational Theory of Learning Flexible Reward-Seeking Behavior with Place Cells | Apr 22, 2022 | MuJoCo, Open-Ended Question Answering | Unverified | 0 |
| Continuously Discovering Novel Strategies via Reward-Switching Policy Optimization | Apr 4, 2022 | continuous-control, Continuous Control | Unverified | 0 |
| Hierarchical Reinforcement Learning of Locomotion Policies in Response to Approaching Objects: A Preliminary Study | Mar 20, 2022 | Deep Reinforcement Learning, Hierarchical Reinforcement Learning | Unverified | 0 |
| Safe adaptation in multiagent competition | Mar 14, 2022 | MuJoCo | Unverified | 0 |
| Context is Everything: Implicit Identification for Dynamics Adaptation | Mar 10, 2022 | MuJoCo | Unverified | 0 |
| AutoDIME: Automatic Design of Interesting Multi-Agent Environments | Mar 4, 2022 | Diagnostic, MuJoCo | Unverified | 0 |
| A Recurrent Differentiable Engine for Modeling Tensegrity Robots Trainable with Low-Frequency Data | Feb 28, 2022 | MuJoCo | Unverified | 0 |
| User-Oriented Robust Reinforcement Learning | Feb 15, 2022 | MuJoCo, reinforcement-learning | Unverified | 0 |
| DNS: Determinantal Point Process Based Neural Network Sampler for Ensemble Reinforcement Learning | Jan 31, 2022 | continuous-control, Continuous Control | Code Available | 0 |
| STOPS: Short-Term-based Volatility-controlled Policy Search and its Global Convergence | Jan 24, 2022 | MuJoCo | Unverified | 0 |
| Recursive Least Squares Advantage Actor-Critic Algorithms | Jan 15, 2022 | Computational Efficiency, continuous-control | Unverified | 0 |
| Comparing Model-free and Model-based Algorithms for Offline Reinforcement Learning | Jan 14, 2022 | model, MuJoCo | Unverified | 0 |
| Self Reward Design with Fine-grained Interpretability | Dec 30, 2021 | Deep Reinforcement Learning, Fairness | Code Available | 0 |
| Multiagent Model-based Credit Assignment for Continuous Control | Dec 27, 2021 | continuous-control, Continuous Control | Unverified | 0 |
| CEM-GD: Cross-Entropy Method with Gradient Descent Planner for Model-Based Reinforcement Learning | Dec 14, 2021 | continuous-control, Continuous Control | Code Available | 0 |
| Continuous Control With Ensemble Deep Deterministic Policy Gradients | Nov 30, 2021 | continuous-control, Continuous Control | Code Available | 0 |
| Uncertainty-aware Low-Rank Q-Matrix Estimation for Deep Reinforcement Learning | Nov 19, 2021 | continuous-control, Continuous Control | Unverified | 0 |
| Aggressive Q-Learning with Ensembles: Achieving Both High Sample Efficiency and High Asymptotic Performance | Nov 17, 2021 | continuous-control, Continuous Control | Unverified | 0 |
| Improving Learning from Demonstrations by Learning from Experience | Nov 16, 2021 | Imitation Learning, MuJoCo | Unverified | 0 |
| GRI: General Reinforced Imitation and its Application to Vision-Based Autonomous Driving | Nov 16, 2021 | Autonomous Driving, CARLA MAP Leaderboard | Unverified | 0 |
| V-MAO: Generative Modeling for Multi-Arm Manipulation of Articulated Objects | Nov 7, 2021 | MuJoCo, Object | Unverified | 0 |
| Time Discretization-Invariant Safe Action Repetition for Policy Gradient Methods | Nov 6, 2021 | MuJoCo, Policy Gradient Methods | Code Available | 0 |
| Smooth Imitation Learning via Smooth Costs and Smooth Policies | Nov 3, 2021 | continuous-control, Continuous Control | Unverified | 0 |
| Policy Search using Dynamic Mirror Descent MPC for Model Free Off Policy RL | Oct 23, 2021 | Model Predictive Control, MuJoCo | Unverified | 0 |
| CIM-PPO:Proximal Policy Optimization with Liu-Correntropy Induced Metric | Oct 20, 2021 | Deep Reinforcement Learning, MuJoCo | Unverified | 0 |
| Balancing Value Underestimation and Overestimation with Realistic Actor-Critic | Oct 19, 2021 | continuous-control, Continuous Control | Code Available | 0 |
| On-Policy Model Errors in Reinforcement Learning | Oct 15, 2021 | model, MuJoCo | Unverified | 0 |
| Wasserstein Unsupervised Reinforcement Learning | Oct 15, 2021 | Hierarchical Reinforcement Learning, MuJoCo | Unverified | 0 |
| Theoretically Principled Deep RL Acceleration via Nearest Neighbor Function Approximation | Oct 9, 2021 | Deep Reinforcement Learning, MuJoCo | Unverified | 0 |
| Generalized Maximum Entropy Reinforcement Learning via Reward Shaping | Sep 29, 2021 | MuJoCo, reinforcement-learning | Unverified | 0 |
| Auto-Encoding Inverse Reinforcement Learning | Sep 29, 2021 | Decision Making, Imitation Learning | Unverified | 0 |
| Distributional Decision Transformer for Hindsight Information Matching | Sep 29, 2021 | continuous-control, Continuous Control | Unverified | 0 |