| A Deep Reinforcement Learning Approach to First-Order Logic Theorem Proving | Nov 5, 2019 | Automated Theorem Proving, Deep Reinforcement Learning | Code Available | 1 | 5 |
| DouZero+: Improving DouDizhu AI by Opponent Modeling and Coach-guided Learning | Apr 6, 2022 | Deep Reinforcement Learning | Code Available | 1 | 5 |
| DPO Meets PPO: Reinforced Token Optimization for RLHF | Apr 29, 2024 | Deep Reinforcement Learning, reinforcement-learning | Code Available | 1 | 5 |
| DREAM: Deep Regret minimization with Advantage baselines and Model-free learning | Jun 18, 2020 | Deep Reinforcement Learning, reinforcement-learning | Code Available | 1 | 5 |
| DRLComplex: Reconstruction of protein quaternary structures using deep reinforcement learning | May 26, 2022 | Deep Reinforcement Learning, reinforcement-learning | Code Available | 1 | 5 |
| Distributed Resource Allocation with Multi-Agent Deep Reinforcement Learning for 5G-V2V Communication | Oct 11, 2020 | Deep Reinforcement Learning, Distributed Optimization | Code Available | 1 | 5 |
| Distributed Two-tier DRL Framework for Cell-Free Network: Association, Beamforming and Power Allocation | Mar 22, 2023 | Deep Reinforcement Learning | Code Available | 1 | 5 |
| A Reinforcement Learning Environment For Job-Shop Scheduling | Apr 8, 2021 | Combinatorial Optimization, Deep Reinforcement Learning | Code Available | 1 | 5 |
| Action Space Shaping in Deep Reinforcement Learning | Apr 2, 2020 | Deep Reinforcement Learning, reinforcement-learning | Code Available | 1 | 5 |
| A Scalable and Reproducible System-on-Chip Simulation for Reinforcement Learning | Apr 27, 2021 | Deep Reinforcement Learning, reinforcement-learning | Code Available | 1 | 5 |