SOTAVerified

Q-Learning

The goal of Q-learning is to learn a policy, which tells an agent what action to take under what circumstances.

(Image credit: Playing Atari with Deep Reinforcement Learning)
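To make the idea of "learning a policy" concrete, here is a minimal sketch of tabular Q-learning on a toy task. The five-state chain environment, the function names (`step`, `train`, `greedy_policy`), and the hyperparameter values are assumptions chosen for illustration. They are not taken from any paper listed on this page.

```python
import random

N_STATES = 5      # hypothetical toy task: states 0..4 on a line, state 4 is terminal
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Toy deterministic environment: reward 1.0 only on reaching the last state."""
    if action == 0:
        next_state = max(0, state - 1)
    else:
        next_state = min(N_STATES - 1, state + 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Run tabular Q-learning and return the table q[state][action]."""
    rng = random.Random(seed)
    q = [[0.0 for _ in ACTIONS] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy behaviour: explore sometimes, otherwise act greedily
            # (ties between equal Q-values are broken at random).
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                best = max(q[state])
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # Q-learning target: reward plus the discounted best next-state value
            # (no bootstrap term once the episode has ended).
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Read the learned policy off the table: best action in each non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]
```

Calling `greedy_policy(train())` on this toy chain yields the action "move right" in every non-terminal state, which is the policy the Q-table encodes once the values have settled. Deep Q-learning methods replace the table with a neural network, but the update target has the same form.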

Papers

Showing 1501–1550 of 1918 papers

| Title | Status | Hype |
| --- | --- | --- |
| ModelicaGym: Applying Reinforcement Learning to Modelica Models | Code | 1 |
| Split Deep Q-Learning for Robust Object Singulation | | 0 |
| ISL: A novel approach for deep exploration | Code | 0 |
| Joint Inference of Reward Machines and Policies for Reinforcement Learning | | 0 |
| SQLR: Short-Term Memory Q-Learning for Elastic Provisioning | | 0 |
| Reinforcement Learning Models of Human Behavior: Reward Processing in Mental Disorders | | 0 |
| Q-learning Assisted Energy-Aware Traffic Offloading and Cell Switching in Heterogeneous Networks | | 0 |
| Mutual-Information Regularization in Markov Decision Processes and Actor-Critic Learning | | 0 |
| A Deep Learning Approach to Grasping the Invisible | Code | 0 |
| Q-Learning Based Aerial Base Station Placement for Fairness Enhancement in Mobile Networks | | 0 |
| A Multistep Lyapunov Approach for Finite-Time Analysis of Biased Stochastic Approximation | | 0 |
| Fixed-Horizon Temporal Difference Methods for Stable Reinforcement Learning | | 0 |
| Self-driving scale car trained by Deep reinforcement learning | | 0 |
| Multi Pseudo Q-learning Based Deterministic Policy Gradient for Tracking Control of Autonomous Underwater Vehicles | | 0 |
| Deep Reinforcement Learning for Control of Probabilistic Boolean Networks | Code | 0 |
| Gradient Q(σ, λ): A Unified Algorithm with Function Approximation for Reinforcement Learning | | 0 |
| Encoders and Decoders for Quantum Expander Codes Using Machine Learning | | 0 |
| Q-DATA: Enhanced Traffic Flow Monitoring in Software-Defined Networks applying Q-learning | | 0 |
| rlpyt: A Research Code Base for Deep Reinforcement Learning in PyTorch | Code | 2 |
| Approximating two value functions instead of one: towards characterizing a new family of Deep Reinforcement Learning algorithms | Code | 0 |
| Solving Discounted Stochastic Two-Player Games with Near-Optimal Time and Sample Complexity | | 0 |
| STMARL: A Spatio-Temporal Multi-Agent Reinforcement Learning Approach for Cooperative Traffic Light Control | | 0 |
| Networked Control of Nonlinear Systems under Partial Observation Using Continuous Deep Q-Learning | | 0 |
| Deep Reinforcement Learning for Foreign Exchange Trading | | 0 |
| Performing Deep Recurrent Double Q-Learning for Atari Games | Code | 0 |
| Learn How to Cook a New Recipe in a New House: Using Map Familiarization, Curriculum Learning, and Bandit Feedback to Learn Families of Text-Based Adventure Games | Code | 0 |
| Large-Scale Traffic Signal Control Using a Novel Multi-Agent Reinforcement Learning | | 0 |
| Batch Recurrent Q-Learning for Backchannel Generation Towards Engaging Agents | | 0 |
| Control of nonlinear, complex and black-boxed greenhouse system with reinforcement learning | Code | 0 |
| Q-MIND: Defeating Stealthy DoS Attacks in SDN with a Machine-learning based Defense Framework | | 0 |
| Towards Model-based Reinforcement Learning for Industry-near Environments | Code | 0 |
| Potential-Based Advice for Stochastic Policy Learning | | 0 |
| Photonic architecture for reinforcement learning | | 0 |
| Model-free Control of Chaos with Continuous Deep Q-learning | | 0 |
| An Optimistic Perspective on Offline Reinforcement Learning | Code | 1 |
| An intelligent financial portfolio trading strategy using deep Q-learning | Code | 0 |
| Q-learning pour la résolution des anaphores pronominales en langue arabe (Q-learning for pronominal anaphora resolution in Arabic texts) | | 0 |
| Way Off-Policy Batch Deep Reinforcement Learning of Implicit Human Preferences in Dialog | | 0 |
| QFlip: An Adaptive Reinforcement Learning Strategy for the FlipIt Security Game | Code | 0 |
| Q-Learning Inspired Self-Tuning for Energy Efficiency in HPC | | 0 |
| Towards Empathic Deep Q-Learning | Code | 0 |
| Deceptive Reinforcement Learning Under Adversarial Manipulations on Cost Signals | | 0 |
| Optimal Use of Experience in First Person Shooter Environments | | 0 |
| In Hindsight: A Smooth Reward for Steady Exploration | | 0 |
| Neural networks with motivation | | 0 |
| Reinforcement Learning-Based Trajectory Design for the Aerial Base Stations | | 0 |
| A Story of Two Streams: Reinforcement Learning Models from Human Behavior and Neuropsychiatry | Code | 1 |
| Split Q Learning: Reinforcement Learning with Two-Stream Rewards | Code | 1 |
| Cache-Aided NOMA Mobile Edge Computing: A Reinforcement Learning Approach | | 0 |
| Reward Prediction Error as an Exploration Objective in Deep RL | | 0 |
Page 31 of 39

No leaderboard results yet.