| Batch Policy Gradient Methods for Improving Neural Conversation Models | Feb 10, 2017 | Chatbot, Policy Gradient Methods | —Unverified | 0 | 0 |
| Batch Reinforcement Learning with a Nonparametric Off-Policy Policy Gradient | Oct 27, 2020 | Policy Gradient Methods, reinforcement-learning | —Unverified | 0 | 0 |
| Bayesian Residual Policy Optimization: Scalable Bayesian Reinforcement Learning with Clairvoyant Experts | Feb 7, 2020 | Decision Making, Policy Gradient Methods | —Unverified | 0 | 0 |
| Beyond Exact Gradients: Convergence of Stochastic Soft-Max Policy Gradient Methods with Entropy Regularization | Oct 19, 2021 | Policy Gradient Methods, Reinforcement Learning (RL) | —Unverified | 0 | 0 |
| Beyond Stationarity: Convergence Analysis of Stochastic Softmax Policy Gradient Methods | Oct 4, 2023 | Decision Making, Policy Gradient Methods | —Unverified | 0 | 0 |
| BOTS: Batch Bayesian Optimization of Extended Thompson Sampling for Severely Episode-Limited RL Settings | Nov 30, 2024 | Bayesian Optimization, Policy Gradient Methods | —Unverified | 0 | 0 |
| CaLcs: Continuously Approximating Longest Common Subsequence for Sequence Level Optimization | Oct 1, 2018 | Abstractive Text Summarization, Image Captioning | —Unverified | 0 | 0 |
| Factored Policy Gradients: Leveraging Structure for Efficient Learning in MOMDPs | Feb 20, 2021 | Policy Gradient Methods | —Unverified | 0 | 0 |
| Commodities Trading through Deep Policy Gradient Methods | Aug 10, 2023 | Algorithmic Trading, Deep Reinforcement Learning | —Unverified | 0 | 0 |
| Communication-Efficient Policy Gradient Methods for Distributed Reinforcement Learning | Dec 7, 2018 | Distributed Computing, Multi-agent Reinforcement Learning | —Unverified | 0 | 0 |