| CDT: Cascading Decision Trees for Explainable Reinforcement Learning | Nov 15, 2020 | Deep Reinforcement Learning, Explainable Models | Code Available | 1 | 5 |
| Training a Resilient Q-Network against Observational Interference | Feb 18, 2021 | Causal Inference, Deep Reinforcement Learning | Code Available | 1 | 5 |
| Learning Multi-Pursuit Evasion for Safe Targeted Navigation of Drones | Apr 7, 2023 | Deep Reinforcement Learning | Code Available | 1 | 5 |
| DPO Meets PPO: Reinforced Token Optimization for RLHF | Apr 29, 2024 | Deep Reinforcement Learning, reinforcement-learning | Code Available | 1 | 5 |
| CORE: Towards Scalable and Efficient Causal Discovery with Reinforcement Learning | Jan 30, 2024 | Causal Discovery, Deep Reinforcement Learning | Code Available | 1 | 5 |
| An actor-critic algorithm with policy gradients to solve the job shop scheduling problem using deep double recurrent agents | Oct 18, 2021 | Deep Reinforcement Learning, Job Shop Scheduling | Code Available | 1 | 5 |
| Amortizing intractable inference in diffusion models for vision, language, and control | May 31, 2024 | Continuous Control | Code Available | 1 | 5 |
| Correlation-aware Cooperative Multigroup Broadcast 360° Video Delivery Network: A Hierarchical Deep Reinforcement Learning Approach | Oct 21, 2020 | Deep Reinforcement Learning, Reinforcement Learning (RL) | Code Available | 1 | 5 |
| Cryptocurrency Portfolio Management with Deep Reinforcement Learning | Dec 5, 2016 | Decision Making, Deep Reinforcement Learning | Code Available | 1 | 5 |
| Continuous-Time Fitted Value Iteration for Robust Policies | Oct 5, 2021 | Continuous Control | Code Available | 1 | 5 |