SOTAVerified

Q-Learning

The goal of Q-learning is to learn a policy that tells an agent which action to take in each state.

(Image credit: Playing Atari with Deep Reinforcement Learning)
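As a rough illustration of what learning such a policy can look like, here is a minimal tabular sketch. It is not taken from any of the listed papers. The five-state chain environment, the `step`, `train`, and `greedy_policy` helpers, and all hyperparameter values are assumptions chosen for the example. The update itself is the standard Q-learning rule: move Q(s, a) toward r + gamma * max over a' of Q(s', a').

```python
import random

# Hypothetical toy environment: a chain of states 0..4, where 4 is terminal.
N_STATES = 5
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Deterministic transition; reward 1.0 only when the last state is reached."""
    if action == 0:
        next_state = max(0, state - 1)
    else:
        next_state = min(N_STATES - 1, state + 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration (assumed settings)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon, or when both actions are tied.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[state][a])
            next_state, reward, done = step(state, action)
            # Bootstrapped target; no future value beyond a terminal state.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """The learned policy: the highest-valued action in each non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


if __name__ == "__main__":
    q_table = train()
    print(greedy_policy(q_table))
```

In this toy setting the greedy policy read off the table should choose "move right" in every non-terminal state. Deep variants such as the Atari work credited above replace the table with a neural network, which this sketch does not attempt.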

Papers

Showing 126-150 of 1918 papers

Title | Status | Hype
DisCor: Corrective Feedback in Reinforcement Learning via Distribution Correction | Code | 1
Gradient Temporal-Difference Learning with Regularized Corrections | Code | 1
Learning the Markov Decision Process in the Sparse Gaussian Elimination | Code | 1
Multi-Agent Determinantal Q-Learning | Code | 1
Energy-based Surprise Minimization for Multi-Agent Value Factorization | Code | 1
A Recipe for Unbounded Data Augmentation in Visual Reinforcement Learning | Code | 1
Evolution Strategies as a Scalable Alternative to Reinforcement Learning | Code | 1
FlapAI Bird: Training an Agent to Play Flappy Bird Using Reinforcement Learning Techniques | Code | 1
Randomized Ensembled Double Q-Learning: Learning Fast Without a Model | Code | 1
Hamilton-Jacobi Deep Q-Learning for Deterministic Continuous-Time Systems with Lipschitz Continuous Controls | Code | 1
A Stochastic Game Framework for Efficient Energy Management in Microgrid Networks | Code | 1
Addressing Function Approximation Error in Actor-Critic Methods | Code | 1
Hybrid RL: Using Both Offline and Online Data Can Make RL Efficient | Code | 1
Automated Cloud Provisioning on AWS using Deep Reinforcement Learning | Code | 1
Backprop-Free Reinforcement Learning with Active Neural Generative Coding | Code | 1
IQ-Learn: Inverse soft-Q Learning for Imitation | Code | 1
Is Q-learning Provably Efficient? | Code | 1
When should we prefer Decision Transformers for Offline Reinforcement Learning? | Code | 1
Benchmarking Batch Deep Reinforcement Learning Algorithms | Code | 1
Believe What You See: Implicit Constraint Approach for Offline Multi-Agent Reinforcement Learning | Code | 1
MADiff: Offline Multi-agent Learning with Diffusion Models | Code | 1
Benchmarking Deep Graph Generative Models for Optimizing New Drug Molecules for COVID-19 | Code | 1
Boosting Soft Actor-Critic: Emphasizing Recent Experience without Forgetting the Past | Code | 1
Boosting Continuous Control with Consistency Policy | Code | 1
Uncertainty Weighted Actor-Critic for Offline Reinforcement Learning | Code | 1
Page 6 of 77

No leaderboard results yet.