| Hindsight Trust Region Policy Optimization | Jul 29, 2019 | Atari Games, Policy Gradient Methods | Code Available | 0 | 5 |
| Hindsight Value Function for Variance Reduction in Stochastic Dynamic Environment | Jul 26, 2021 | Deep Reinforcement Learning, Policy Gradient Methods | Code Available | 0 | 5 |
| Fast Efficient Hyperparameter Tuning for Policy Gradients | Feb 18, 2019 | Meta-Learning, Policy Gradient Methods | Code Available | 0 | 5 |
| Fast Efficient Hyperparameter Tuning for Policy Gradient Methods | Dec 1, 2019 | Policy Gradient Methods | Code Available | 0 | 5 |
| Evaluating Rewards for Question Generation Models | Feb 28, 2019 | Machine Translation, Policy Gradient Methods | Code Available | 0 | 5 |
| Action-depedent Control Variates for Policy Optimization via Stein's Identity | Oct 30, 2017 | Policy Gradient Methods, reinforcement-learning | Code Available | 0 | 5 |
| Greedy Actor-Critic: A New Conditional Cross-Entropy Method for Policy Improvement | Oct 22, 2018 | Policy Gradient Methods, Q-Learning | Code Available | 0 | 5 |
| Client Selection for Federated Policy Optimization with Environment Heterogeneity | May 18, 2023 | MuJoCo, Policy Gradient Methods | Code Available | 0 | 5 |
| Enabling Efficient, Reliable Real-World Reinforcement Learning with Approximate Physics-Based Models | Jul 16, 2023 | Policy Gradient Methods | Code Available | 0 | 5 |
| Neural Replicator Dynamics | Jun 1, 2019 | counterfactual, Deep Reinforcement Learning | Code Available | 0 | 5 |