SOTAVerified

Q-Learning

The goal of Q-learning is to learn a policy that tells an agent which action to take in each state.
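As a rough illustration of how such a policy can be learned, the sketch below runs tabular Q-learning on a hypothetical five-state chain. The environment (`step`), the hyperparameters (`alpha`, `gamma`, `epsilon`), and all function names are assumptions made for this example and are not taken from any paper listed on this page.

```python
import random

N_STATES = 5        # hypothetical chain: states 0..4, state 4 is terminal
ACTIONS = (0, 1)    # 0 = move left, 1 = move right


def step(state, action):
    """Toy deterministic environment (an assumption for illustration only)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0   # reward only on reaching the terminal state
    return next_state, reward, done


def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Learn a Q-table with the standard one-step Q-learning update."""
    rng = random.Random(seed)
    q = [[0.0 for _ in ACTIONS] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy action selection, breaking ties at random.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                best = max(q[state])
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # Bootstrapped target: r + gamma * max_a' Q(s', a'), no bootstrap at terminal.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Read the learned policy off the Q-table for every non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]
```

Calling `greedy_policy(train())` returns one action per non-terminal state. In this toy chain the learned policy should prefer moving right, since that is the only way to reach the rewarded terminal state.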

(Image credit: Playing Atari with Deep Reinforcement Learning)

Papers

Showing 1201–1225 of 1918 papers

| Title | Status | Hype |
|---|---|---|
| Stabilizing Transformer-Based Action Sequence Generation For Q-Learning | | 0 |
| Deep Surrogate Q-Learning for Autonomous Driving | | 0 |
| Reinforcement learning using Deep Q Networks and Q learning accurately localizes brain tumors on MRI with very small training sets | | 0 |
| On Information Asymmetry in Competitive Multi-Agent Reinforcement Learning: Convergence and Optimality | | 0 |
| Logistic Q-Learning | | 0 |
| Language Inference with Multi-head Automata through Reinforcement Learning | | 0 |
| Learning Dexterous Manipulation from Suboptimal Experts | | 0 |
| Multi-Agent Collaboration via Reward Attribution Decomposition | Code | 1 |
| A Nesterov's Accelerated quasi-Newton method for Global Routing using Deep Reinforcement Learning | | 0 |
| Model-Based Reinforcement Learning for Type 1 Diabetes Blood Glucose Control | | 0 |
| Parameterized Reinforcement Learning for Optical System Optimization | | 0 |
| EpidemiOptim: A Toolbox for the Optimization of Control Policies in Epidemiological Models | Code | 1 |
| Q-learning with Language Model for Edit-based Unsupervised Summarization | Code | 1 |
| Instance Weighted Incremental Evolution Strategies for Reinforcement Learning in Dynamic Environments | Code | 0 |
| Fictitious play in zero-sum stochastic games | | 0 |
| Model-Free Non-Stationary RL: Near-Optimal Regret and Applications in Multi-Agent RL and Inventory Control | | 0 |
| Machine Learning Empowered Trajectory and Passive Beamforming Design in UAV-RIS Wireless Networks | | 0 |
| Cross Learning in Deep Q-Networks | | 0 |
| Finite-Time Analysis for Double Q-learning | | 0 |
| Towards Understanding Linear Value Decomposition in Cooperative Multi-Agent Q-Learning | | 0 |
| Deep Jump Q-Evaluation for Offline Policy Evaluation in Continuous Action Space | | 0 |
| Near-Optimal Regret Bounds for Model-Free RL in Non-Stationary Episodic MDPs | | 0 |
| A New Approach for Tactical Decision Making in Lane Changing: Sample Efficient Deep Q Learning with a Safety Feedback Reward | | 0 |
| Bootstrapped Q-learning with Context Relevant Observation Pruning to Generalize in Text-based Games | Code | 0 |
| Is Q-Learning Provably Efficient? An Extended Analysis | | 0 |
Page 49 of 77

No leaderboard results yet.