SOTAVerified

Q-Learning

The goal of Q-learning is to learn a policy, which tells an agent what action to take under what circumstances.
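
As an illustration of how such a policy can be learned, here is a minimal sketch of tabular Q-learning. It is not taken from any paper listed on this page. The corridor environment, the function names `q_learning` and `greedy_policy`, and all hyperparameter values are assumptions chosen for the example. The agent keeps a table of action values, nudges each entry toward the observed reward plus the discounted best value of the next state, and the policy is read off by picking the highest-valued action in each state.

```python
import random


def q_learning(n_states=5, episodes=500, alpha=0.1, gamma=0.9,
               epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical 1-D corridor.

    States are 0..n_states-1. Action 0 moves left (blocked at state 0),
    action 1 moves right. Entering the last state yields reward 1 and
    ends the episode; every other step yields reward 0.
    """
    rng = random.Random(seed)
    n_actions = 2
    goal = n_states - 1
    q = [[0.0] * n_actions for _ in range(n_states)]

    for _ in range(episodes):
        state = 0
        while state != goal:
            # Epsilon-greedy action selection with random tie-breaking.
            if rng.random() < epsilon:
                action = rng.randrange(n_actions)
            else:
                best = max(q[state])
                action = rng.choice(
                    [a for a in range(n_actions) if q[state][a] == best])

            # Deterministic corridor dynamics (an assumption of this sketch).
            next_state = max(0, state - 1) if action == 0 else state + 1
            reward = 1.0 if next_state == goal else 0.0

            # Q-learning update: bootstrap from the best next action.
            # The terminal state contributes no future value.
            future = 0.0 if next_state == goal else gamma * max(q[next_state])
            q[state][action] += alpha * (reward + future - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Return the index of the highest-valued action for each state."""
    return [max(range(len(row)), key=row.__getitem__) for row in q]


if __name__ == "__main__":
    table = q_learning()
    print(greedy_policy(table))
```

The policy entry for the terminal state is never used, because no action is taken there. In larger problems the table is typically replaced by a function approximator, but the update rule keeps the same shape.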

(Image credit: Playing Atari with Deep Reinforcement Learning)

Papers

Showing 926-950 of 1918 papers

| Title | Status | Hype |
| --- | --- | --- |
| Multi-Agent Advisor Q-Learning | Code | 0 |
| Automating Control of Overestimation Bias for Reinforcement Learning | | 0 |
| Can Q-Learning be Improved with Advice? | | 0 |
| Deep Reinforcement Learning for Simultaneous Sensing and Channel Access in Cognitive Networks | | 0 |
| A Reinforcement Learning Approach to Parameter Selection for Distributed Optimal Power Flow | | 0 |
| Can Q-learning solve Multi Armed Bantids? | | 0 |
| Playing 2048 With Reinforcement Learning | Code | 0 |
| Balancing Value Underestimation and Overestimation with Realistic Actor-Critic | Code | 0 |
| A Q-Learning-based Approach for Distributed Beam Scheduling in mmWave Networks | | 0 |
| Online Target Q-learning with Reverse Experience Replay: Efficiently finding the Optimal Policy for Linear MDPs | | 0 |
| Value Penalized Q-Learning for Recommender Systems | | 0 |
| Provably Efficient Multi-Agent Reinforcement Learning with Fully Decentralized Communication | | 0 |
| Provably Efficient Reinforcement Learning in Decentralized General-Sum Markov Games | | 0 |
| On Improving Model-Free Algorithms for Decentralized Multi-Agent Reinforcement Learning | | 0 |
| Offline Reinforcement Learning with Implicit Q-Learning | Code | 1 |
| Fast Block Linear System Solver Using Q-Learning Schduling for Unified Dynamic Power System Simulations | | 0 |
| Urban traffic dynamic rerouting framework: A DRL-based model with fog-cloud architecture | | 0 |
| Navigation In Urban Environments Amongst Pedestrians Using Multi-Objective Deep Reinforcement Learning | | 0 |
| A Deep Learning Inference Scheme Based on Pipelined Matrix Multiplication Acceleration Design and Non-uniform Quantization | | 0 |
| Breaking the Sample Complexity Barrier to Regret-Optimal Model-Free Reinforcement Learning | | 0 |
| Training Transition Policies via Distribution Matching for Complex Tasks | Code | 0 |
| Compositional Q-learning for electrolyte repletion with imbalanced patient sub-populations | | 0 |
| A study of first-passage time minimization via Q-learning in heated gridworlds | | 0 |
| A Deep Reinforcement Learning Framework for Contention-Based Spectrum Sharing | | 0 |
| Dropout Q-Functions for Doubly Efficient Reinforcement Learning | Code | 1 |
Page 38 of 77

No leaderboard results yet.