SOTAVerified

Q-Learning

Q-learning is a model-free reinforcement learning algorithm. Its goal is to learn a policy that tells an agent which action to take in each state, by estimating the value Q(s, a) of taking action a in state s.
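As a rough illustration of how such a policy can be learned, here is a minimal sketch of tabular Q-learning. The environment is a hypothetical five-state corridor invented for this example (it is not taken from any paper listed here): the agent starts at the left end, can move left or right, and receives a reward of 1 on reaching the rightmost state. The function names and the hyperparameters `alpha`, `gamma`, and `epsilon` are illustrative assumptions.

```python
import random


def q_learning(n_states=5, episodes=500, alpha=0.1, gamma=0.9,
               epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical 1-D corridor (assumed toy task).

    States are 0..n_states-1; actions are 0 (left) and 1 (right).
    Entering the rightmost state yields reward 1 and ends the episode.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    goal = n_states - 1

    for _ in range(episodes):
        state = 0
        while state != goal:
            # Epsilon-greedy action selection, breaking ties at random.
            if rng.random() < epsilon:
                action = rng.randrange(2)
            else:
                best = max(q[state])
                action = rng.choice(
                    [a for a in (0, 1) if q[state][a] == best])

            # Hypothetical dynamics: move one cell, clipped at the left wall.
            next_state = max(state - 1, 0) if action == 0 else state + 1
            reward = 1.0 if next_state == goal else 0.0

            # Q-learning update: bootstrap from the best next-state value.
            bootstrap = 0.0 if next_state == goal else max(q[next_state])
            target = reward + gamma * bootstrap
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Return the greedy action for every non-terminal state."""
    return [max(range(len(row)), key=row.__getitem__) for row in q[:-1]]


if __name__ == "__main__":
    q_table = q_learning()
    print(greedy_policy(q_table))
```

The learned table maps each state to an action by taking the highest-valued entry in its row, which is the sense in which Q-learning "learns a policy". Deep variants such as those in the papers below replace the table with a neural network.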

(Image credit: Playing Atari with Deep Reinforcement Learning)

Papers

Showing 51-100 of 1918 papers

| Title | Status | Hype |
| --- | --- | --- |
| Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments | Code | 1 |
| Multi-Agent Collaboration via Reward Attribution Decomposition | Code | 1 |
| Multi-Agent Determinantal Q-Learning | Code | 1 |
| Multi-Agent Reinforcement Learning via Distributed MPC as a Function Approximator | Code | 1 |
| Deep Reinforcement Learning with Double Q-learning | Code | 1 |
| Offline Reinforcement Learning with Implicit Q-Learning | Code | 1 |
| Continuous Deep Q-Learning with Model-based Acceleration | Code | 1 |
| Offline RL with No OOD Actions: In-Sample Learning via Implicit Value Regularization | Code | 1 |
| Optimistic Exploration even with a Pessimistic Initialisation | Code | 1 |
| Optimistic Multi-Agent Policy Gradient | Code | 1 |
| PGDQN: Preference-Guided Deep Q-Network | Code | 1 |
| PlanDQ: Hierarchical Plan Orchestration via D-Conductor and Q-Performer | Code | 1 |
| When should we prefer Decision Transformers for Offline Reinforcement Learning? | Code | 1 |
| Counterfactual Conservative Q Learning for Offline Multi-agent Reinforcement Learning | Code | 1 |
| Deep Reinforcement Q-Learning for Intelligent Traffic Signal Control with Partial Detection | Code | 1 |
| Cal-QL: Calibrated Offline RL Pre-Training for Efficient Online Fine-Tuning | Code | 1 |
| Boosting Soft Actor-Critic: Emphasizing Recent Experience without Forgetting the Past | Code | 1 |
| Can Q-Learning with Graph Networks Learn a Generalizable Branching Heuristic for a SAT Solver? | Code | 1 |
| Addressing Function Approximation Error in Actor-Critic Methods | Code | 1 |
| Benchmarking Deep Graph Generative Models for Optimizing New Drug Molecules for COVID-19 | Code | 1 |
| Boosting Continuous Control with Consistency Policy | Code | 1 |
| Combining Reinforcement Learning with Lin-Kernighan-Helsgaun Algorithm for the Traveling Salesman Problem | Code | 1 |
| Conservative Q-Learning for Offline Reinforcement Learning | Code | 1 |
| Continuous control with deep reinforcement learning | Code | 1 |
| Deep Active Inference for Partially Observable MDPs | Code | 1 |
| Deep Inverse Q-learning with Constraints | Code | 1 |
| Acting in Delayed Environments with Non-Stationary Markov Policies | Code | 1 |
| Deep Recurrent Q-Learning for Partially Observable MDPs | Code | 1 |
| CCLF: A Contrastive-Curiosity-Driven Learning Framework for Sample-Efficient Reinforcement Learning | Code | 1 |
| A Deep Reinforcement Learning Approach for Finding Non-Exploitable Strategies in Two-Player Atari Games | Code | 1 |
| Backprop-Free Reinforcement Learning with Active Neural Generative Coding | Code | 1 |
| DisCor: Corrective Feedback in Reinforcement Learning via Distribution Correction | Code | 1 |
| Automated Cloud Provisioning on AWS using Deep Reinforcement Learning | Code | 1 |
| Dropout Q-Functions for Doubly Efficient Reinforcement Learning | Code | 1 |
| Energy-based Surprise Minimization for Multi-Agent Value Factorization | Code | 1 |
| Believe What You See: Implicit Constraint Approach for Offline Multi-Agent Reinforcement Learning | Code | 1 |
| A Recipe for Unbounded Data Augmentation in Visual Reinforcement Learning | Code | 1 |
| FlapAI Bird: Training an Agent to Play Flappy Bird Using Reinforcement Learning Techniques | Code | 1 |
| Free from Bellman Completeness: Trajectory Stitching via Model-based Return-conditioned Supervised Learning | Code | 1 |
| GAIL-PT: A Generic Intelligent Penetration Testing Framework with Generative Adversarial Imitation Learning | Code | 1 |
| HASCO: Towards Agile HArdware and Software CO-design for Tensor Computation | Code | 1 |
| Hybrid RL: Using Both Offline and Online Data Can Make RL Efficient | Code | 1 |
| Image Classification by Reinforcement Learning with Two-State Q-Learning | Code | 1 |
| Can Q-Learning with Graph Networks Learn a Generalizable Branching Heuristic for a SAT Solver? | Code | 1 |
| Learning the Markov Decision Process in the Sparse Gaussian Elimination | Code | 1 |
| LS-IQ: Implicit Reward Regularization for Inverse Reinforcement Learning | Code | 1 |
| MAN: Multi-Action Networks Learning | Code | 1 |
| MASER: Multi-Agent Reinforcement Learning with Subgoals Generated from Experience Replay Buffer | Code | 1 |
| An Optimistic Perspective on Offline Deep Reinforcement Learning | Code | 1 |
| A Stochastic Game Framework for Efficient Energy Management in Microgrid Networks | Code | 1 |
