| ReAct Meets ActRe: When Language Agents Enjoy Training Data Autonomy | Mar 21, 2024 | Policy Gradient Methods | —Unverified | 0 | 0 |
| Reinforcement Learning: An Overview | Dec 6, 2024 | Decision Making, Deep Reinforcement Learning | —Unverified | 0 | 0 |
| Reinforcement Learning based Sequential Batch-sampling for Bayesian Optimal Experimental Design | Dec 21, 2021 | Deep Reinforcement Learning, Experimental Design | —Unverified | 0 | 0 |
| Reinforcement Learning in Linear Quadratic Deep Structured Teams: Global Convergence of Policy Gradient Methods | Nov 29, 2020 | Policy Gradient Methods | —Unverified | 0 | 0 |
| Residual Policy Gradient: A Reward View of KL-regularized Objective | Mar 14, 2025 | Imitation Learning, MuJoCo | —Unverified | 0 | 0 |
| Rethinking Deep Policy Gradients via State-Wise Policy Improvement | Oct 19, 2020 | Policy Gradient Methods, Value prediction | —Unverified | 0 | 0 |
| Reusing Historical Trajectories in Natural Policy Gradient via Importance Sampling: Convergence and Convergence Rate | Mar 1, 2024 | Policy Gradient Methods | —Unverified | 0 | 0 |
| Reward-estimation variance elimination in sequential decision processes | Nov 15, 2018 | Policy Gradient Methods, Reinforcement Learning | —Unverified | 0 | 0 |
| Riemannian stochastic optimization methods avoid strict saddle points | Nov 4, 2023 | Dictionary Learning, Policy Gradient Methods | —Unverified | 0 | 0 |
| Risk-Sensitive Reinforcement Learning via Policy Gradient Search | Oct 22, 2018 | Policy Gradient Methods, reinforcement-learning | —Unverified | 0 | 0 |