| OpenSpiel: A Framework for Reinforcement Learning in Games | Aug 26, 2019 | General Reinforcement Learning, reinforcement-learning | Code Available | 3 | 5 |
| ReMax: A Simple, Effective, and Efficient Reinforcement Learning Method for Aligning Large Language Models | Oct 16, 2023 | General Reinforcement Learning, GPU | Code Available | 2 | 5 |
| Rec-R1: Bridging Generative Large Language Models and User-Centric Recommendation Systems via Reinforcement Learning | Mar 31, 2025 | General Reinforcement Learning, Instruction Following | Code Available | 2 | 5 |
| Kinetix: Investigating the Training of General Agents through Open-Ended Physics-Based Control Tasks | Oct 30, 2024 | General Reinforcement Learning, Reinforcement Learning (RL) | Code Available | 2 | 5 |
| Discovering General Reinforcement Learning Algorithms with Adversarial Environment Design | Oct 4, 2023 | Deep Reinforcement Learning, General Reinforcement Learning | Code Available | 1 | 5 |
| Adaptive Rational Activations to Boost Deep Reinforcement Learning | Feb 18, 2021 | Atari Games, Deep Reinforcement Learning | Code Available | 1 | 5 |
| Learning Deformable Object Manipulation from Expert Demonstrations | Jul 20, 2022 | Deformable Object Manipulation, General Reinforcement Learning | Code Available | 1 | 5 |
| Learning Exploration Policies for Navigation | Mar 5, 2019 | Efficient Exploration, General Reinforcement Learning | Code Available | 1 | 5 |
| Learning to Incentivize Other Learning Agents | Jun 10, 2020 | General Reinforcement Learning, Reinforcement Learning (RL) | Code Available | 1 | 5 |
| Action Branching Architectures for Deep Reinforcement Learning | Nov 24, 2017 | continuous-control, Continuous Control | Code Available | 1 | 5 |
| Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm | Dec 5, 2017 | Game of Chess, Game of Go | Code Available | 1 | 5 |
| Dynamic Algorithm Configuration: Foundation of a New Meta-Algorithmic Framework | Jun 1, 2020 | General Reinforcement Learning | Code Available | 1 | 5 |
| NOVER: Incentive Training for Language Models via Verifier-Free Reinforcement Learning | May 21, 2025 | General Reinforcement Learning, Logical Reasoning | Code Available | 1 | 5 |
| Counterfactual Data Augmentation using Locally Factored Dynamics | Jul 6, 2020 | counterfactual, Data Augmentation | Code Available | 1 | 5 |
| End-to-End Egospheric Spatial Memory | Feb 15, 2021 | General Reinforcement Learning, Imitation Learning | Code Available | 1 | 5 |
| Align-RUDDER: Learning From Few Demonstrations by Reward Redistribution | Sep 29, 2020 | General Reinforcement Learning, Minecraft | Code Available | 1 | 5 |
| Data-Efficient Reinforcement Learning with Self-Predictive Representations | Jul 12, 2020 | Atari Games 100k, Data Augmentation | Code Available | 1 | 5 |
| Sample Factory: Egocentric 3D Control from Pixels at 100000 FPS with Asynchronous Reinforcement Learning | Jun 21, 2020 | FPS Games, General Reinforcement Learning | Code Available | 1 | 5 |
| DeFIX: Detecting and Fixing Failure Scenarios with Reinforcement Learning in Imitation Learning Based Autonomous Driving | Oct 29, 2022 | Autonomous Driving, CARLA MAP Leaderboard | Code Available | 1 | 5 |
| Stabilizing Transformers for Reinforcement Learning | Oct 13, 2019 | General Reinforcement Learning, Language Modeling | Code Available | 1 | 5 |
| Developmental Reinforcement Learning of Control Policy of a Quadcopter UAV with Thrust Vectoring Rotors | Jul 15, 2020 | Developmental Learning, Drone Controller | Code Available | 1 | 5 |
| Intelligent Resource Allocation in Joint Radar-Communication With Graph Neural Networks | Oct 17, 2022 | Autonomous Driving, Autonomous Vehicles | Code Available | 1 | 5 |
| Intelligent Trading Systems: A Sentiment-Aware Reinforcement Learning Approach | Nov 14, 2021 | Algorithmic Trading, General Reinforcement Learning | Code Available | 1 | 5 |
| Time Limits in Reinforcement Learning | Dec 1, 2017 | General Reinforcement Learning, reinforcement-learning | Code Available | 1 | 5 |
| Gibson Env: Real-World Perception for Embodied Agents | Aug 31, 2018 | Domain Adaptation, General Reinforcement Learning | Code Available | 0 | 5 |
| Doubly-Robust Estimation for Correcting Position-Bias in Click Feedback for Unbiased Learning to Rank | Mar 31, 2022 | counterfactual, General Reinforcement Learning | Code Available | 0 | 5 |
| Catastrophic Interference in Reinforcement Learning: A Solution Based on Context Division and Knowledge Distillation | Sep 1, 2021 | Deep Reinforcement Learning, General Reinforcement Learning | Code Available | 0 | 5 |
| A Monte Carlo AIXI Approximation | Sep 4, 2009 | General Reinforcement Learning, Open-Ended Question Answering | Code Available | 0 | 5 |
| The LoCA Regret: A Consistent Metric to Evaluate Model-Based Behavior in Reinforcement Learning | Jul 7, 2020 | General Reinforcement Learning, Model-based Reinforcement Learning | Code Available | 0 | 5 |
| AIXIjs: A Software Demo for General Reinforcement Learning | May 22, 2017 | General Reinforcement Learning, OpenAI Gym | Code Available | 0 | 5 |
| Interactive Learning from Activity Description | Feb 13, 2021 | General Reinforcement Learning, Grounded language learning | Code Available | 0 | 5 |
| Local and Global Explanations of Agent Behavior: Integrating Strategy Summaries with Saliency Maps | May 18, 2020 | Atari Games, Decision Making | Code Available | 0 | 5 |
| Hypercube Policy Regularization Framework for Offline Reinforcement Learning | Nov 7, 2024 | D4RL, General Reinforcement Learning | Code Available | 0 | 5 |
| Is Deep Reinforcement Learning Really Superhuman on Atari? Leveling the playing field | Aug 13, 2019 | Atari Games, Deep Reinforcement Learning | Code Available | 0 | 5 |
| QKSA: Quantum Knowledge Seeking Agent | Jul 3, 2021 | Artificial Life, General Reinforcement Learning | Code Available | 0 | 5 |
| Learning to Represent Action Values as a Hypergraph on the Action Vertices | Oct 28, 2020 | Atari Games, Continuous Control | Code Available | 0 | 5 |
| Learning to Backdoor Federated Learning | Mar 6, 2023 | Backdoor Attack, Federated Learning | Code Available | 0 | 5 |
| Using a Logarithmic Mapping to Enable Lower Discount Factors in Reinforcement Learning | Jun 3, 2019 | General Reinforcement Learning, reinforcement-learning | Code Available | 0 | 5 |
| Dex: Incremental Learning for Complex Environments in Deep Reinforcement Learning | Jun 19, 2017 | Continual Learning, Deep Reinforcement Learning | Code Available | 0 | 5 |
| Generalised Discount Functions applied to a Monte-Carlo AImu Implementation | Mar 3, 2017 | General Reinforcement Learning, reinforcement-learning | Code Available | 0 | 5 |
| The Problem of Social Cost in Multi-Agent General Reinforcement Learning: Survey and Synthesis | Dec 3, 2024 | General Reinforcement Learning, Multi-agent Reinforcement Learning | Unverified | 0 | 0 |
| The Sample-Complexity of General Reinforcement Learning | Aug 22, 2013 | General Reinforcement Learning, reinforcement-learning | Unverified | 0 | 0 |
| Towards More Efficient, Robust, Instance-adaptive, and Generalizable Sequential Decision making | Apr 12, 2025 | Decision Making, Decision Making Under Uncertainty | Unverified | 0 | 0 |
| Transferring Agent Behaviors from Videos via Motion GANs | Nov 21, 2017 | General Reinforcement Learning, Generative Adversarial Network | Unverified | 0 | 0 |
| Variational Regret Bounds for Reinforcement Learning | May 14, 2019 | General Reinforcement Learning, reinforcement-learning | Unverified | 0 | 0 |
| Abstractions of General Reinforcement Learning | Dec 26, 2021 | General Reinforcement Learning, reinforcement-learning | Unverified | 0 | 0 |
| AcceRL: Policy Acceleration Framework for Deep Reinforcement Learning | Nov 28, 2022 | Decision Making, Deep Reinforcement Learning | Unverified | 0 | 0 |
| Accuracy-Guaranteed Collaborative DNN Inference in Industrial IoT via Deep Reinforcement Learning | Dec 31, 2022 | Deep Reinforcement Learning, Edge-computing | Unverified | 0 | 0 |
| Active Information Acquisition | Feb 5, 2016 | General Reinforcement Learning, Reinforcement Learning | Unverified | 0 | 0 |
| A Policy Efficient Reduction Approach to Convex Constrained Deep Reinforcement Learning | Aug 29, 2021 | Deep Reinforcement Learning, General Reinforcement Learning | Unverified | 0 | 0 |