SOTAVerified

Q-Learning

Q-learning is a model-free reinforcement learning algorithm that learns an action-value function Q(s, a). Its goal is to learn a policy, which tells an agent what action to take in each state.
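As a rough illustration of how such a policy can be learned, here is a minimal sketch of tabular Q-learning. The chain environment, the function names (`q_learning`, `greedy_policy`), and all hyperparameter values are assumptions made up for this example; they are not taken from any of the papers listed on this page.

```python
import random


def q_learning(n_states=5, episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical 1-D chain environment.

    States are 0..n_states-1. The agent starts in state 0 and the episode
    ends on reaching the rightmost state, which pays reward 1.
    Actions: 0 = move left, 1 = move right.
    """
    rng = random.Random(seed)
    goal = n_states - 1
    # One row per state, one entry per action, initialised to zero.
    q = [[0.0, 0.0] for _ in range(n_states)]

    def step(state, action):
        # Deterministic transition; the left edge acts as a wall.
        nxt = max(0, state - 1) if action == 0 else min(goal, state + 1)
        reward = 1.0 if nxt == goal else 0.0
        return nxt, reward, nxt == goal

    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy action selection, with random tie-breaking.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1

            nxt, reward, done = step(state, action)

            # Q-learning update: bootstrap from the best action in the next
            # state, regardless of which action will actually be taken there.
            target = reward if done else reward + gamma * max(q[nxt])
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
    return q


def greedy_policy(q):
    """Return the highest-valued action index for every state."""
    return [max(range(len(row)), key=row.__getitem__) for row in q]
```

Calling `greedy_policy(q_learning())` gives one action per state. In this toy setup the non-terminal states should all map to action 1 (move right), which is the "what action to take under what circumstances" mapping the description refers to. The terminal state is never acted from, so its entry is not meaningful.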

(Image credit: Playing Atari with Deep Reinforcement Learning)

Papers

Showing 51-75 of 1918 papers

Title | Status | Hype
Backprop-Free Reinforcement Learning with Active Neural Generative Coding | Code | 1
Believe What You See: Implicit Constraint Approach for Offline Multi-Agent Reinforcement Learning | Code | 1
Image Classification by Reinforcement Learning with Two-State Q-Learning | Code | 1
Can Q-Learning with Graph Networks Learn a Generalizable Branching Heuristic for a SAT Solver? | Code | 1
Adaptive Contention Window Design using Deep Q-learning | Code | 1
Cal-QL: Calibrated Offline RL Pre-Training for Efficient Online Fine-Tuning | Code | 1
Benchmarking Deep Graph Generative Models for Optimizing New Drug Molecules for COVID-19 | Code | 1
Learning Synergies between Pushing and Grasping with Self-supervised Deep Reinforcement Learning | Code | 1
Boosting Continuous Control with Consistency Policy | Code | 1
Boosting Soft Actor-Critic: Emphasizing Recent Experience without Forgetting the Past | Code | 1
MAN: Multi-Action Networks Learning | Code | 1
MASER: Multi-Agent Reinforcement Learning with Subgoals Generated from Experience Replay Buffer | Code | 1
Deep Inverse Q-learning with Constraints | Code | 1
Addressing Function Approximation Error in Actor-Critic Methods | Code | 1
Continuous control with deep reinforcement learning | Code | 1
Continuous Deep Q-Learning with Model-based Acceleration | Code | 1
FACMAC: Factored Multi-Agent Centralised Policy Gradients | Code | 1
Combining Reinforcement Learning with Lin-Kernighan-Helsgaun Algorithm for the Traveling Salesman Problem | Code | 1
Counterfactual Conservative Q Learning for Offline Multi-agent Reinforcement Learning | Code | 1
Deep Active Inference for Partially Observable MDPs | Code | 1
When should we prefer Decision Transformers for Offline Reinforcement Learning? | Code | 1
Deep Reinforcement Learning-based Intelligent Traffic Signal Controls with Optimized CO2 emissions | Code | 1
Coarse-to-Fine Q-attention: Efficient Learning for Visual Robotic Manipulation via Discretisation | Code | 1
DFAC Framework: Factorizing the Value Function via Quantile Mixture for Multi-Agent Distributional Q-Learning | Code | 1
An Optimistic Perspective on Offline Deep Reinforcement Learning | Code | 1

No leaderboard results yet.