| Lyapunov-based Safe Policy Optimization for Continuous Control | Jan 28, 2019 | continuous-control, Continuous Control | Code Available | 0 | 5 |
| LLMs for sensory-motor control: Combining in-context and iterative learning | Jun 5, 2025 | MuJoCo | Code Available | 0 | 5 |
| Online Reinforcement Learning in Non-Stationary Context-Driven Environments | Feb 4, 2023 | MuJoCo, reinforcement-learning | Code Available | 0 | 5 |
| MDP Playground: An Analysis and Debug Testbed for Reinforcement Learning | Sep 17, 2019 | MuJoCo, OpenAI Gym | Code Available | 0 | 5 |
| Off-Policy Average Reward Actor-Critic with Deterministic Policy Search | May 20, 2023 | MuJoCo | Code Available | 0 | 5 |
| Learning What To Do by Simulating the Past | Apr 8, 2021 | MuJoCo | Code Available | 0 | 5 |
| Learning to Play Cup-and-Ball with Noisy Camera Observations | Jul 19, 2020 | MuJoCo | Code Available | 0 | 5 |
| Learning Task Belief Similarity with Latent Dynamics for Meta-Reinforcement Learning | Jun 24, 2025 | Meta Reinforcement Learning, MuJoCo | Code Available | 0 | 5 |
| Leveraging exploration in off-policy algorithms via normalizing flows | May 16, 2019 | continuous-control, Continuous Control | Code Available | 0 | 5 |
| Bayesian Policy Gradients via Alpha Divergence Dropout Inference | Dec 6, 2017 | continuous-control, Continuous Control | Code Available | 0 | 5 |