SOTAVerified

Q-Learning

Q-learning aims to learn a policy that tells an agent which action to take in each state, by estimating the value of each state-action pair.

(Image credit: Playing Atari with Deep Reinforcement Learning)
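
As a rough illustration of how such a policy can be learned, here is a minimal sketch of tabular Q-learning. The five-state chain environment, the function names `q_learning` and `greedy_policy`, and all hyperparameter values are assumptions made for this example; none of them come from the papers listed on this page.

```python
import random


def q_learning(n_states=5, episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical 1-D chain.

    States are 0..n_states-1; action 0 moves left, action 1 moves right.
    Reaching the last state yields reward 1 and ends the episode.
    """
    rng = random.Random(seed)
    # One row per state, one column per action, initialised to zero.
    q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        state = 0
        while state != n_states - 1:
            if rng.random() < epsilon:
                # Explore: pick a random action.
                action = rng.randrange(2)
            else:
                # Exploit: pick a greedy action, breaking ties at random.
                best = max(q[state])
                action = rng.choice([a for a in (0, 1) if q[state][a] == best])
            # Deterministic transition; moving left from state 0 stays at 0.
            next_state = max(state - 1, 0) if action == 0 else state + 1
            reward = 1.0 if next_state == n_states - 1 else 0.0
            # Standard Q-learning update toward the bootstrapped target.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Return the greedy action for each state (first action wins ties)."""
    return [max((0, 1), key=lambda a: row[a]) for row in q]


if __name__ == "__main__":
    table = q_learning()
    print(greedy_policy(table))
```

In this toy setting the greedy policy read off the learned table moves right in every non-terminal state. The terminal state is never updated, so its entry carries no meaning.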

Papers

Showing 101–150 of 1918 papers

| Title | Status | Hype |
|---|---|---|
| Q-learning with Language Model for Edit-based Unsupervised Summarization | Code | 1 |
| EpidemiOptim: A Toolbox for the Optimization of Control Policies in Epidemiological Models | Code | 1 |
| Energy-based Surprise Minimization for Multi-Agent Value Factorization | Code | 1 |
| Deep Active Inference for Partially Observable MDPs | Code | 1 |
| Table2Charts: Recommending Charts by Learning Shared Table Representations | Code | 1 |
| Robust Deep Reinforcement Learning through Adversarial Loss | Code | 1 |
| Deep Inverse Q-learning with Constraints | Code | 1 |
| QPLEX: Duplex Dueling Multi-Agent Q-Learning | Code | 1 |
| SUNRISE: A Simple Unified Framework for Ensemble Learning in Deep Reinforcement Learning | Code | 1 |
| Neural Interactive Collaborative Filtering | Code | 1 |
| Reward Machines for Cooperative Multi-Agent Reinforcement Learning | Code | 1 |
| Gradient Temporal-Difference Learning with Regularized Corrections | Code | 1 |
| Image Classification by Reinforcement Learning with Two-State Q-Learning | Code | 1 |
| Weighted QMIX: Expanding Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning | Code | 1 |
| Semantic Visual Navigation by Watching YouTube Videos | Code | 1 |
| Conservative Q-Learning for Offline Reinforcement Learning | Code | 1 |
| Multi-Agent Determinantal Q-Learning | Code | 1 |
| Modeling Penetration Testing with Reinforcement Learning Using Capture-the-Flag Challenges: Trade-offs between Model-free Learning and A Priori Knowledge | Code | 1 |
| Spatial Action Maps for Mobile Manipulation | Code | 1 |
| Using Deep Reinforcement Learning Methods for Autonomous Vessels in 2D Environments | Code | 1 |
| FlapAI Bird: Training an Agent to Play Flappy Bird Using Reinforcement Learning Techniques | Code | 1 |
| DisCor: Corrective Feedback in Reinforcement Learning via Distribution Correction | Code | 1 |
| FACMAC: Factored Multi-Agent Centralised Policy Gradients | Code | 1 |
| Optimistic Exploration even with a Pessimistic Initialisation | Code | 1 |
| Maxmin Q-learning: Controlling the Estimation Bias of Q-learning | Code | 1 |
| A Stochastic Game Framework for Efficient Energy Management in Microgrid Networks | Code | 1 |
| Discriminator Soft Actor Critic without Extrinsic Rewards | Code | 1 |
| An Optimistic Perspective on Offline Deep Reinforcement Learning | Code | 1 |
| Benchmarking Batch Deep Reinforcement Learning Algorithms | Code | 1 |
| Can Q-Learning with Graph Networks Learn a Generalizable Branching Heuristic for a SAT Solver? | Code | 1 |
| ModelicaGym: Applying Reinforcement Learning to Modelica Models | Code | 1 |
| An Optimistic Perspective on Offline Reinforcement Learning | Code | 1 |
| A Story of Two Streams: Reinforcement Learning Models from Human Behavior and Neuropsychiatry | Code | 1 |
| Split Q Learning: Reinforcement Learning with Two-Stream Rewards | Code | 1 |
| Boosting Soft Actor-Critic: Emphasizing Recent Experience without Forgetting the Past | Code | 1 |
| SQIL: Imitation Learning via Reinforcement Learning with Sparse Rewards | Code | 1 |
| Optimization of Molecules via Deep Reinforcement Learning | Code | 1 |
| Negative Update Intervals in Deep Multi-Agent Reinforcement Learning | Code | 1 |
| Is Q-learning Provably Efficient? | Code | 1 |
| Learning Synergies between Pushing and Grasping with Self-supervised Deep Reinforcement Learning | Code | 1 |
| Addressing Function Approximation Error in Actor-Critic Methods | Code | 1 |
| Mean Field Multi-Agent Reinforcement Learning | Code | 1 |
| Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor | Code | 1 |
| Automated Cloud Provisioning on AWS using Deep Reinforcement Learning | Code | 1 |
| Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments | Code | 1 |
| Evolution Strategies as a Scalable Alternative to Reinforcement Learning | Code | 1 |
| Stabilising Experience Replay for Deep Multi-Agent Reinforcement Learning | Code | 1 |
| Continuous Deep Q-Learning with Model-based Acceleration | Code | 1 |
| Multiagent Cooperation and Competition with Deep Reinforcement Learning | Code | 1 |
| Deep Reinforcement Learning with Double Q-learning | Code | 1 |

No leaderboard results yet.