SOTAVerified

Q-Learning

The goal of Q-learning is to learn a policy, which tells an agent what action to take under what circumstances.
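As a minimal sketch of how such a policy can be learned, the example below runs tabular Q-learning on a toy corridor. The environment, the function names `q_learning` and `greedy_policy`, and all hyperparameter values are assumptions made for illustration; they are not taken from any paper listed on this page.

```python
import random


def q_learning(n_states=5, episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical corridor.

    States are 0..n_states-1, the agent starts at state 0, and reaching the
    right-most state ends the episode with reward 1 (all other rewards are 0).
    """
    rng = random.Random(seed)
    moves = (-1, +1)  # action 0 = step left, action 1 = step right
    goal = n_states - 1
    q = [[0.0, 0.0] for _ in range(n_states)]  # q[state][action]

    for _ in range(episodes):
        state = 0
        while state != goal:
            # Epsilon-greedy selection; ties are broken at random.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1

            # Walls clamp the agent inside the corridor.
            next_state = min(max(state + moves[action], 0), goal)
            reward = 1.0 if next_state == goal else 0.0

            # Q-learning target: reward plus discounted best next value
            # (the terminal state contributes no future value).
            if next_state == goal:
                target = reward
            else:
                target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Return the index of the highest-valued action for each state."""
    return [max(range(len(row)), key=lambda a: row[a]) for row in q]


if __name__ == "__main__":
    table = q_learning()
    print(greedy_policy(table))
```

The learned table maps each state to action values, and the greedy policy reads off the best action per state. In this toy setting every non-terminal state should end up preferring action 1 (step right).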

(Image credit: Playing Atari with Deep Reinforcement Learning)

Papers

Showing 101-125 of 1918 papers

Title | Status | Hype
ModelicaGym: Applying Reinforcement Learning to Modelica Models | Code | 1
Modeling Penetration Testing with Reinforcement Learning Using Capture-the-Flag Challenges: Trade-offs between Model-free Learning and A Priori Knowledge | Code | 1
Multi-Agent Determinantal Q-Learning | Code | 1
Multi-Agent Reinforcement Learning via Distributed MPC as a Function Approximator | Code | 1
Negative Update Intervals in Deep Multi-Agent Reinforcement Learning | Code | 1
Neural Interactive Collaborative Filtering | Code | 1
A Recipe for Unbounded Data Augmentation in Visual Reinforcement Learning | Code | 1
Offline RL with No OOD Actions: In-Sample Learning via Implicit Value Regularization | Code | 1
Optimal Market Making by Reinforcement Learning | Code | 1
Optimistic Exploration even with a Pessimistic Initialisation | Code | 1
Adaptive Contention Window Design using Deep Q-learning | Code | 1
Playing Atari with Deep Reinforcement Learning | Code | 1
An Optimistic Perspective on Offline Deep Reinforcement Learning | Code | 1
Pre-Training for Robots: Offline RL Enables Learning New Tasks from a Handful of Trials | Code | 1
QPLEX: Duplex Dueling Multi-Agent Q-Learning | Code | 1
Towards Universal and Black-Box Query-Response Only Attack on LLMs with QROA | Code | 1
Reactive Exploration to Cope with Non-Stationarity in Lifelong Reinforcement Learning | Code | 1
Reasoning with Latent Diffusion in Offline Reinforcement Learning | Code | 1
Research on Robot Path Planning Based on Reinforcement Learning | Code | 1
A Stochastic Game Framework for Efficient Energy Management in Microgrid Networks | Code | 1
Benchmarking Batch Deep Reinforcement Learning Algorithms | Code | 1
Reward Machines for Cooperative Multi-Agent Reinforcement Learning | Code | 1
Coarse-to-Fine Q-attention: Efficient Learning for Visual Robotic Manipulation via Discretisation | Code | 1
Robust Q-learning Algorithm for Markov Decision Processes under Wasserstein Uncertainty | Code | 1
DFAC Framework: Factorizing the Value Function via Quantile Mixture for Multi-Agent Distributional Q-Learning | Code | 1

No leaderboard results yet.