| BiERL: A Meta Evolutionary Reinforcement Learning Framework via Bilevel Optimization | Aug 1, 2023 | Bilevel OptimizationDiversity | CodeCode Available | 0 | 5 |
| Bayes Adaptive Monte Carlo Tree Search for Offline Model-based Reinforcement Learning | Oct 15, 2024 | D4RLModel-based Reinforcement Learning | CodeCode Available | 0 | 5 |
| Fuzzy Tiling Activations: A Simple Approach to Learning Sparse Representations Online | Nov 19, 2019 | Continual Learningcontinuous-control | CodeCode Available | 0 | 5 |
| PMAT: Optimizing Action Generation Order in Multi-Agent Reinforcement Learning | Feb 23, 2025 | Action GenerationDecision Making | CodeCode Available | 0 | 5 |
| Efficient Reward Poisoning Attacks on Online Deep Reinforcement Learning | May 30, 2022 | Data PoisoningDeep Reinforcement Learning | CodeCode Available | 0 | 5 |
| Balancing Value Underestimation and Overestimation with Realistic Actor-Critic | Oct 19, 2021 | continuous-controlContinuous Control | CodeCode Available | 0 | 5 |
| On Learning Intrinsic Rewards for Policy Gradient Methods | Apr 17, 2018 | Atari GamesDecision Making | CodeCode Available | 0 | 5 |
| Merging Decision Transformers: Weight Averaging for Forming Multi-Task Policies | Mar 14, 2023 | Decision MakingMuJoCo | CodeCode Available | 0 | 5 |
| MDP Playground: An Analysis and Debug Testbed for Reinforcement Learning | Sep 17, 2019 | MuJoCoOpenAI Gym | CodeCode Available | 0 | 5 |
| Decision Transformer under Random Frame Dropping | Mar 3, 2023 | Deep Reinforcement LearningMuJoCo | CodeCode Available | 0 | 5 |
| Bootstrapping the Expressivity with Model-based Planning | Sep 25, 2019 | modelMuJoCo | CodeCode Available | 0 | 5 |
| A dynamical clipping approach with task feedback for Proximal Policy Optimization | Dec 12, 2023 | Language ModellingLarge Language Model | CodeCode Available | 0 | 5 |
| Debiased Offline Representation Learning for Fast Online Adaptation in Non-stationary Dynamics | Feb 17, 2024 | MuJoCoRepresentation Learning | CodeCode Available | 0 | 5 |
| BAIL: Best-Action Imitation Learning for Batch Deep Reinforcement Learning | Oct 27, 2019 | Deep Reinforcement LearningImitation Learning | CodeCode Available | 0 | 5 |
| Episodic Curiosity through Reachability | Oct 4, 2018 | MuJoCoReinforcement Learning | CodeCode Available | 0 | 5 |
| An Invariant Information Geometric Method for High-Dimensional Online Optimization | Jan 3, 2024 | Bayesian OptimizationMuJoCo | CodeCode Available | 0 | 5 |
| Mildly Constrained Evaluation Policy for Offline Reinforcement Learning | Jun 6, 2023 | D4RLMuJoCo | CodeCode Available | 0 | 5 |
| Lyapunov-based Safe Policy Optimization for Continuous Control | Jan 28, 2019 | continuous-controlContinuous Control | CodeCode Available | 0 | 5 |
| A Generalized Training Approach for Multiagent Learning | Sep 27, 2019 | MuJoCo | CodeCode Available | 0 | 5 |
| Back to Basics: Benchmarking Canonical Evolution Strategies for Playing Atari | Feb 24, 2018 | Atari GamesBenchmarking | CodeCode Available | 0 | 5 |
| Locally Persistent Exploration in Continuous Control Tasks with Sparse Rewards | Dec 26, 2020 | continuous-controlContinuous Control | CodeCode Available | 0 | 5 |
| Cyclic Policy Distillation: Sample-Efficient Sim-to-Real Reinforcement Learning with Domain Randomization | Jul 29, 2022 | Deep Reinforcement LearningMuJoCo | CodeCode Available | 0 | 5 |
| LLMs for sensory-motor control: Combining in-context and iterative learning | Jun 5, 2025 | MuJoCo | CodeCode Available | 0 | 5 |
| Online Reinforcement Learning in Non-Stationary Context-Driven Environments | Feb 4, 2023 | MuJoCoreinforcement-learning | CodeCode Available | 0 | 5 |
| On Rollouts in Model-Based Reinforcement Learning | Jan 28, 2025 | modelModel-based Reinforcement Learning | CodeCode Available | 0 | 5 |
| Exploring Model-based Planning with Policy Networks | Jun 20, 2019 | Benchmarkingmodel | CodeCode Available | 0 | 5 |
| Learning What To Do by Simulating the Past | Apr 8, 2021 | MuJoCo | CodeCode Available | 0 | 5 |
| Learning to Play Cup-and-Ball with Noisy Camera Observations | Jul 19, 2020 | MuJoCo | CodeCode Available | 0 | 5 |
| Extrapolating Beyond Suboptimal Demonstrations via Inverse Reinforcement Learning from Observations | Apr 12, 2019 | Deep Reinforcement LearningImitation Learning | CodeCode Available | 0 | 5 |
| Learning Task Belief Similarity with Latent Dynamics for Meta-Reinforcement Learning | Jun 24, 2025 | Meta Reinforcement LearningMuJoCo | CodeCode Available | 0 | 5 |
| Leveraging exploration in off-policy algorithms via normalizing flows | May 16, 2019 | continuous-controlContinuous Control | CodeCode Available | 0 | 5 |
| Learning Powerful Policies by Using Consistent Dynamics Model | Jun 11, 2019 | Atari Gamesmodel | CodeCode Available | 0 | 5 |
| Controlled Diversity with Preference : Towards Learning a Diverse Set of Desired Skills | Mar 7, 2023 | DiversityMuJoCo | CodeCode Available | 0 | 5 |
| Robust Deep Reinforcement Learning with Adaptive Adversarial Perturbations in Action Space | May 20, 2024 | Deep Reinforcement LearningMuJoCo | CodeCode Available | 0 | 5 |
| Fat-to-Thin Policy Optimization: Offline RL with Sparse Policies | Jan 24, 2025 | MuJoCoOffline RL | CodeCode Available | 0 | 5 |
| CEM-GD: Cross-Entropy Method with Gradient Descent Planner for Model-Based Reinforcement Learning | Dec 14, 2021 | continuous-controlContinuous Control | CodeCode Available | 0 | 5 |
| Feudal Graph Reinforcement Learning | Apr 11, 2023 | Decision MakingGraph Clustering | CodeCode Available | 0 | 5 |
| A Perspective of Q-value Estimation on Offline-to-Online Reinforcement Learning | Dec 12, 2023 | MuJoCoOffline RL | CodeCode Available | 0 | 5 |
| Continuous Transition: Improving Sample Efficiency for Continuous Control Problems via MixUp | Nov 30, 2020 | continuous-controlContinuous Control | CodeCode Available | 0 | 5 |
| Sample Dropout: A Simple yet Effective Variance Reduction Technique in Deep Policy Optimization | Feb 5, 2023 | Deep Reinforcement LearningMuJoCo | CodeCode Available | 0 | 5 |
| ADDQ: Adaptive Distributional Double Q-Learning | Jun 24, 2025 | Distributional Reinforcement LearningMuJoCo | CodeCode Available | 0 | 5 |
| A general class of surrogate functions for stable and efficient reinforcement learning | Aug 12, 2021 | MuJoCoPolicy Gradient Methods | CodeCode Available | 0 | 5 |
| LightZero: A Unified Benchmark for Monte Carlo Tree Search in General Sequential Decision Scenarios | Oct 12, 2023 | Board GamesDecision Making | CodeCode Available | 0 | 5 |
| Learning Generalizable Skills from Offline Multi-Task Data for Multi-Agent Cooperation | Mar 27, 2025 | MuJoCoSMAC | CodeCode Available | 0 | 5 |
| Learning Calibratable Policies using Programmatic Style-Consistency | Oct 2, 2019 | Imitation LearningMuJoCo | CodeCode Available | 0 | 5 |
| Formal Language Constraints for Markov Decision Processes | Oct 2, 2019 | Atari GamesMuJoCo | CodeCode Available | 0 | 5 |
| Learning Goal Embeddings via Self-Play for Hierarchical Reinforcement Learning | Nov 22, 2018 | Hierarchical Reinforcement LearningMuJoCo | CodeCode Available | 0 | 5 |
| Continuous Control With Ensemble Deep Deterministic Policy Gradients | Nov 30, 2021 | continuous-controlContinuous Control | CodeCode Available | 0 | 5 |
| Asynchronous Methods for Model-Based Reinforcement Learning | Oct 28, 2019 | modelModel-based Reinforcement Learning | CodeCode Available | 0 | 5 |
| Language as an Abstraction for Hierarchical Deep Reinforcement Learning | Jun 18, 2019 | Deep Reinforcement LearningInstruction Following | CodeCode Available | 0 | 5 |