| Neural Inventory Control in Networks via Hindsight Differentiable Policy Optimization | Jun 20, 2023 | Deep Reinforcement Learning, Management | Code Available | 1 |
| Efficient Diffusion Policies for Offline Reinforcement Learning | May 31, 2023 | D4RL, Offline RL | Code Available | 1 |
| Policy Gradient Methods in the Presence of Symmetries and State Abstractions | May 9, 2023 | Continuous Control | Code Available | 1 |
| Online Portfolio Management via Deep Reinforcement Learning with High-Frequency Data | May 1, 2023 | Deep Reinforcement Learning, Management | Code Available | 1 |
| Partial advantage estimator for proximal policy optimization | Jan 26, 2023 | MuJoCo, Policy Gradient Methods | Code Available | 1 |
| Is Reinforcement Learning (Not) for Natural Language Processing: Benchmarks, Baselines, and Building Blocks for Natural Language Policy Optimization | Oct 3, 2022 | Decision Making, Policy Gradient Methods | Code Available | 1 |
| Continuous MDP Homomorphisms and Homomorphic Policy Gradient | Sep 15, 2022 | Continuous Control | Code Available | 1 |
| Reactive Exploration to Cope with Non-Stationarity in Lifelong Reinforcement Learning | Jul 12, 2022 | Lifelong learning, Policy Gradient Methods | Code Available | 1 |
| The Sufficiency of Off-Policyness and Soft Clipping: PPO is still Insufficient according to an Off-Policy Measure | May 20, 2022 | Efficient Exploration, Policy Gradient Methods | Code Available | 1 |
| Episodic Policy Gradient Training | Dec 3, 2021 | Policy Gradient Methods, Scheduling | Code Available | 1 |