| Invariant Policy Optimization: Towards Stronger Generalization in Reinforcement Learning | Jun 1, 2020 | Policy Gradient Methodsreinforcement-learning | CodeCode Available | 1 | 5 |
| Is Reinforcement Learning (Not) for Natural Language Processing: Benchmarks, Baselines, and Building Blocks for Natural Language Policy Optimization | Oct 3, 2022 | Decision MakingPolicy Gradient Methods | CodeCode Available | 1 | 5 |
| Bayesian Action Decoder for Deep Multi-Agent Reinforcement Learning | Nov 4, 2018 | DecoderMulti-agent Reinforcement Learning | CodeCode Available | 1 | 5 |
| Learning Opinion Summarizers by Selecting Informative Reviews | Sep 9, 2021 | Few-Shot LearningOpinion Summarization | CodeCode Available | 1 | 5 |
| Competitive Policy Optimization | Jun 18, 2020 | Policy Gradient Methods | CodeCode Available | 1 | 5 |
| Fast Efficient Hyperparameter Tuning for Policy Gradient Methods | Dec 1, 2019 | Policy Gradient Methods | CodeCode Available | 0 | 5 |
| Fast Efficient Hyperparameter Tuning for Policy Gradients | Feb 18, 2019 | Meta-LearningPolicy Gradient Methods | CodeCode Available | 0 | 5 |
| Enabling Efficient, Reliable Real-World Reinforcement Learning with Approximate Physics-Based Models | Jul 16, 2023 | Policy Gradient Methods | CodeCode Available | 0 | 5 |
| Action-depedent Control Variates for Policy Optimization via Stein's Identity | Oct 30, 2017 | Policy Gradient Methodsreinforcement-learning | CodeCode Available | 0 | 5 |
| Evaluating Rewards for Question Generation Models | Feb 28, 2019 | Machine TranslationPolicy Gradient Methods | CodeCode Available | 0 | 5 |