SOTAVerified

Q-Learning

Q-learning is a model-free reinforcement learning algorithm that learns an action-value function Q(s, a). The goal is a policy that tells an agent which action to take in each state.

(Image credit: Playing Atari with Deep Reinforcement Learning)
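
To make the description above concrete, here is a minimal sketch of the standard tabular Q-learning update. The toy chain environment, the function names (`q_update`, `greedy_policy`, `train_chain`), and the hyperparameter values are all assumptions chosen for illustration; none of them come from the papers listed on this page.

```python
import random


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9, done=False):
    """Apply one tabular Q-learning update in place and return the new value."""
    # Bootstrapped target: reward plus the discounted best value of the next state.
    best_next = 0.0 if done else max(q[next_state])
    td_target = reward + gamma * best_next
    q[state][action] += alpha * (td_target - q[state][action])
    return q[state][action]


def greedy_policy(q):
    """Return the index of the highest-valued action for every state."""
    return [max(range(len(row)), key=row.__getitem__) for row in q]


def train_chain(n_states=5, episodes=500, max_steps=200, seed=0):
    """Hypothetical toy task: a chain of states, action 0 = left, 1 = right.

    Entering the right-most state gives reward 1 and ends the episode.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            # Q-learning is off-policy, so a uniformly random behavior
            # policy is enough for this small example.
            action = rng.randrange(2)
            next_state = max(0, state - 1) if action == 0 else state + 1
            done = next_state == n_states - 1
            reward = 1.0 if done else 0.0
            q_update(q, state, action, reward, next_state, done=done)
            if done:
                break
            state = next_state
    return q


if __name__ == "__main__":
    q_table = train_chain()
    print(greedy_policy(q_table))
```

In this sketch the learned policy is read off the table by taking the highest-valued action in each state. The terminal state is never updated, so its entry in the policy carries no meaning.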

Papers

Showing 1551-1600 of 1918 papers

| Title | Status | Hype |
| --- | --- | --- |
| A Generalized Minimax Q-learning Algorithm for Two-Player Zero-Sum Stochastic Games | | 0 |
| Provably Efficient Q-learning with Function Approximation via Distribution Shift Error Checking Oracle | | 0 |
| Variance-reduced Q-learning is minimax optimal | | 0 |
| Deep Reinforcement Learning with Discrete Normalized Advantage Functions for Resource Management in Network Slicing | | 0 |
| "Did You Hear That?" Learning to Play Video Games from Audio Cues | | 0 |
| Boosting Soft Actor-Critic: Emphasizing Recent Experience without Forgetting the Past | Code | 1 |
| Escaping the State of Nature: A Hobbesian Approach to Cooperation in Multi-agent Reinforcement Learning | | 0 |
| Exploration with Unreliable Intrinsic Reward in Multi-Agent Reinforcement Learning | | 0 |
| Risk-Sensitive Compact Decision Trees for Autonomous Execution in Presence of Simulated Market Response | | 0 |
| Deep Q-Learning for Directed Acyclic Graph Generation | | 0 |
| On-board Deep Q-Network for UAV-assisted Online Power Transfer and Data Collection | | 0 |
| Reinforcement Learning with Low-Complexity Liquid State Machines | Code | 0 |
| Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction | Code | 0 |
| Feature-Based Q-Learning for Two-Player Stochastic Games | | 0 |
| RSS-Based Q-Learning for Indoor UAV Navigation | | 0 |
| Provably Efficient Q-Learning with Low Switching Cost | | 0 |
| Learning NP-Hard Multi-Agent Assignment Planning using GNN: Inference on a Random Graph and Provable Auction-Fitted Q-learning | | 0 |
| Reinforcement Learning for Slate-based Recommender Systems: A Tractable Decomposition and Practical Methodology | | 0 |
| A General Markov Decision Process Framework for Directly Learning Optimal Control Policies | | 0 |
| Solving NP-Hard Problems on Graphs with Extended AlphaGo Zero | Code | 0 |
| Finite-Sample Analysis of Nonlinear Stochastic Approximation with Applications in Reinforcement Learning | Code | 0 |
| SQIL: Imitation Learning via Reinforcement Learning with Sparse Rewards | Code | 1 |
| Prioritized Sequence Experience Replay | | 0 |
| A Kernel Loss for Solving the Bellman Equation | Code | 0 |
| MQLV: Optimal Policy of Money Management in Retail Banking with Q-Learning | | 0 |
| Neural Temporal-Difference and Q-Learning Provably Converge to Global Optima | Code | 0 |
| Adaptive Symmetric Reward Noising for Reinforcement Learning | Code | 0 |
| Deep Q-Learning with Q-Matrix Transfer Learning for Novel Fire Evacuation Environment | | 0 |
| Stochastic Variance Reduction for Deep Q-learning | | 0 |
| Deep Reinforcement Learning Based Parameter Control in Differential Evolution | Code | 0 |
| Reinforcement Learning for Learning of Dynamical Systems in Uncertain Environment: a Tutorial | | 0 |
| QBSO-FS: A Reinforcement Learning Based Bee Swarm Optimization Metaheuristic for Feature Selection | Code | 0 |
| Reinforcement Learning for Robotics and Control with Active Uncertainty Reduction | | 0 |
| Autonomous Penetration Testing using Reinforcement Learning | | 0 |
| Stochastic approximation with cone-contractive operators: Sharp ℓ∞-bounds for Q-learning | Code | 0 |
| Domain Adversarial Reinforcement Learning for Partial Domain Adaptation | | 0 |
| Design of Artificial Intelligence Agents for Games using Deep Reinforcement Learning | | 0 |
| Pretrain Soft Q-Learning with Imperfect Demonstrations | | 0 |
| A Reinforcement Learning Perspective on the Optimal Control of Mutation Probabilities for the (1+1) Evolutionary Algorithm: First Results on the OneMax Problem | | 0 |
| Toward Packet Routing with Fully-distributed Multi-agent Deep Reinforcement Learning | | 0 |
| Accelerated Target Updates for Q-learning | | 0 |
| Comprehensible Context-driven Text Game Playing | Code | 0 |
| Deep Ordinal Reinforcement Learning | Code | 0 |
| Efficient Model-free Reinforcement Learning in Metric Spaces | Code | 0 |
| Learning agents with prioritization and parameter noise in continuous state and action space | | 0 |
| Two-Timescale Networks for Nonlinear Value Function Approximation | | 0 |
| Soft Q-Learning with Mutual-Information Regularization | | 0 |
| A Deep Q-Learning Method for Downlink Power Allocation in Multi-Cell Networks | | 0 |
| Zap Q-Learning for Optimal Stopping Time Problems | | 0 |
| Target-Based Temporal Difference Learning | | 0 |
Page 32 of 39

No leaderboard results yet.