| A reinterpretation of the policy oscillation phenomenon in approximate policy iteration | Dec 1, 2011 | Policy Gradient Methods, Reinforcement Learning | —Unverified | 0 | 0 |
| A Self-Supervised Reinforcement Learning Approach for Fine-Tuning Large Language Models Using Cross-Attention Signals | Feb 14, 2025 | Policy Gradient Methods | —Unverified | 0 | 0 |
| Assumption Questioning: Latent Copying and Reward Exploitation in Question Generation | Sep 27, 2018 | Inductive Bias, Machine Translation | —Unverified | 0 | 0 |
| A Study of Policy Gradient on a Class of Exactly Solvable Models | Nov 3, 2020 | Policy Gradient Methods | —Unverified | 0 | 0 |
| Asynchronous Actor-Critic for Multi-Agent Reinforcement Learning | Sep 20, 2022 | Decision Making, Multi-agent Reinforcement Learning | —Unverified | 0 | 0 |
| Asynchronous Multi-Agent Actor-Critic with Macro-Actions | Sep 29, 2021 | Decision Making, Policy Gradient Methods | —Unverified | 0 | 0 |
| Asynchronous stochastic approximations with asymptotically biased errors and deep multi-agent learning | Feb 22, 2018 | Multi-agent Reinforcement Learning, Policy Gradient Methods | —Unverified | 0 | 0 |
| Augmented Bayesian Policy Search | Jul 5, 2024 | Bayesian Optimization, LEMMA | —Unverified | 0 | 0 |
| AUGMENTED POLICY GRADIENT METHODS FOR EFFICIENT REINFORCEMENT LEARNING | Sep 25, 2019 | Policy Gradient Methods, reinforcement-learning | —Unverified | 0 | 0 |
| A unified view of entropy-regularized Markov decision processes | May 22, 2017 | Policy Gradient Methods, reinforcement-learning | —Unverified | 0 | 0 |