SOTAVerified

Q-Learning

The goal of Q-learning is to learn a policy, which tells an agent what action to take under what circumstances.

(Image credit: Playing Atari with Deep Reinforcement Learning)
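
As a rough illustration of what learning such a policy can look like, here is a minimal tabular sketch. The 1-D chain environment, the hyperparameters (`alpha`, `gamma`, `epsilon`) and the function names `q_learning` and `greedy_policy` are assumptions made for this example; they are not taken from any of the papers listed on this page.

```python
import random


def q_learning(n_states=5, episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical 1-D chain.

    States are 0..n_states-1, actions are 0 (left) and 1 (right),
    and the only reward (1.0) is given on reaching the last state.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    goal = n_states - 1
    for _ in range(episodes):
        state = 0
        while state != goal:
            # Epsilon-greedy choice; ties are broken at random.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            # Deterministic transition; moving left from state 0 stays put.
            next_state = max(0, state - 1) if action == 0 else state + 1
            reward = 1.0 if next_state == goal else 0.0
            # The goal is terminal, so it contributes no future value.
            future = 0.0 if next_state == goal else max(q[next_state])
            # Q-learning update: move Q(s, a) toward r + gamma * max_a' Q(s', a').
            q[state][action] += alpha * (reward + gamma * future - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Return the highest-valued action for each state."""
    return [max(range(len(row)), key=row.__getitem__) for row in q]


if __name__ == "__main__":
    table = q_learning()
    print(greedy_policy(table))
```

The learned table maps each state to action values, and the greedy policy reads off "what action to take" in each state. The entry for the terminal state is never updated, so its greedy action carries no meaning.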

Papers

Showing 1251–1275 of 1918 papers

| Title | Status | Hype |
| --- | --- | --- |
| MFC-EQ: Mean-Field Control with Envelope Q-Learning for Moving Decentralized Agents in Formation | | 0 |
| Millimeter Wave Communications with an Intelligent Reflector: Performance Optimization and Distributional Reinforcement Learning | | 0 |
| Mimicking Human Intuition: Cognitive Belief-Driven Q-Learning | | 0 |
| Minimax Optimal Q Learning with Nearest Neighbors | | 0 |
| Minimizing Age-of-Information for Fog Computing-supported Vehicular Networks with Deep Q-learning | | 0 |
| Minimizing the Outage Probability in a Markov Decision Process | | 0 |
| Misspecified Q-Learning with Sparse Linear Function Approximation: Tight Bounds on Approximation Error | | 0 |
| Mitigate Bias in Face Recognition using Skewness-Aware Reinforcement Learning | | 0 |
| Mitigating Bias in Face Recognition Using Skewness-Aware Reinforcement Learning | | 0 |
| Mitigating Relative Over-Generalization in Multi-Agent Reinforcement Learning | | 0 |
| Mixed-Precision Conjugate Gradient Solvers with RL-Driven Precision Tuning | | 0 |
| Mix Q-learning for Lane Changing: A Collaborative Decision-Making Method in Multi-Agent Deep Reinforcement Learning | | 0 |
| Model-aided Deep Reinforcement Learning for Sample-efficient UAV Trajectory Design in IoT Networks | | 0 |
| Model-Augmented Q-learning | | 0 |
| Model-based Multi-Agent Reinforcement Learning with Cooperative Prioritized Sweeping | | 0 |
| Model-Based Reinforcement Learning for Type 1 Diabetes Blood Glucose Control | | 0 |
| Model-based versus model-free feeding control and water quality monitoring for fish growth tracking in aquaculture systems | | 0 |
| Provably Efficient Model-Free Algorithm for MDPs with Peak Constraints | | 0 |
| Model-Free Algorithm and Regret Analysis for MDPs with Long-Term Constraints | | 0 |
| Improving Sample Efficiency of Model-Free Algorithms for Zero-Sum Markov Games | | 0 |
| Model-Free Characterizations of the Hamilton-Jacobi-Bellman Equation and Convex Q-Learning in Continuous Time | | 0 |
| Model-free Control of Chaos with Continuous Deep Q-learning | | 0 |
| Model-Free Mean-Field Reinforcement Learning: Mean-Field MDP and Mean-Field Q-Learning | | 0 |
| Model-free optimal controller for discrete-time Markovian jump linear systems: A Q-learning approach | | 0 |
| Model-free Posterior Sampling via Learning Rate Randomization | | 0 |
Page 51 of 77

No leaderboard results yet.