SOTAVerified

Q-Learning

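Q-learning is a model-free reinforcement learning algorithm. Its goal is to learn a policy that tells an agent which action to take in each state, which it does by estimating the value Q(s, a) of taking action a in state s.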

(Image credit: Playing Atari with Deep Reinforcement Learning)
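
As a rough illustration of how such a policy can be learned, here is a minimal sketch of tabular Q-learning. The environment is an assumption made only for this example: a hypothetical five-state chain where action 0 moves left, action 1 moves right, and reaching the last state yields a reward of 1 and ends the episode. The function names (`q_learning`, `greedy_policy`) and the hyperparameter values are likewise illustrative and do not come from any of the listed papers.

```python
import random


def q_learning(n_states=5, episodes=500, alpha=0.5, gamma=0.9,
               epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical 1-D chain environment.

    States are 0..n_states-1; action 0 moves left (clamped at state 0),
    action 1 moves right. Entering the last state gives reward 1 and
    terminates the episode. Returns the learned Q-table.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    goal = n_states - 1

    for _ in range(episodes):
        state = 0
        while state != goal:
            # Epsilon-greedy action selection, breaking ties at random.
            if rng.random() < epsilon:
                action = rng.randrange(2)
            else:
                best = max(q[state])
                action = rng.choice(
                    [a for a in (0, 1) if q[state][a] == best])

            # Deterministic transition and reward (assumed environment).
            next_state = max(0, state - 1) if action == 0 else state + 1
            reward = 1.0 if next_state == goal else 0.0

            # Q-learning update: bootstrap from the best next action,
            # with no future value once the terminal state is reached.
            future = 0.0 if next_state == goal else gamma * max(q[next_state])
            q[state][action] += alpha * (reward + future - q[state][action])
            state = next_state

    return q


def greedy_policy(q):
    """Pick the highest-valued action in every state."""
    return [max(range(len(row)), key=row.__getitem__) for row in q]


if __name__ == "__main__":
    table = q_learning()
    # Policy for the non-terminal states only.
    print(greedy_policy(table)[:-1])
```

The update line is the core of the method: the estimate for the chosen action moves toward the observed reward plus the discounted value of the best action in the next state. Deep variants such as DQN replace the table with a neural network but keep the same target.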

Papers

Showing 1251-1275 of 1918 papers

| Title | Status | Hype |
| --- | --- | --- |
| Learning Time Reduction Using Warm Start Methods for a Reinforcement Learning Based Supervisory Control in Hybrid Electric Vehicle Applications | | 0 |
| Energy Consumption and Battery Aging Minimization Using a Q-learning Strategy for a Battery/Ultracapacitor Electric Vehicle | | 0 |
| Energy and Service-priority aware Trajectory Design for UAV-BSs using Double Q-Learning | | 0 |
| Enhancing reinforcement learning by a finite reward response filter with a case study in intelligent structural control | | 0 |
| An Adiabatic Theorem for Policy Tracking with TD-learning | | 0 |
| Stabilizing Transformer-Based Action Sequence Generation For Q-Learning | | 0 |
| Logistic Q-Learning | | 0 |
| Deep Surrogate Q-Learning for Autonomous Driving | | 0 |
| Reinforcement learning using Deep Q Networks and Q learning accurately localizes brain tumors on MRI with very small training sets | | 0 |
| On Information Asymmetry in Competitive Multi-Agent Reinforcement Learning: Convergence and Optimality | | 0 |
| Language Inference with Multi-head Automata through Reinforcement Learning | | 0 |
| Learning Dexterous Manipulation from Suboptimal Experts | | 0 |
| A Nesterov's Accelerated quasi-Newton method for Global Routing using Deep Reinforcement Learning | | 0 |
| Model-Based Reinforcement Learning for Type 1 Diabetes Blood Glucose Control | | 0 |
| Parameterized Reinforcement Learning for Optical System Optimization | | 0 |
| Instance Weighted Incremental Evolution Strategies for Reinforcement Learning in Dynamic Environments | Code | 0 |
| Fictitious play in zero-sum stochastic games | | 0 |
| Model-Free Non-Stationary RL: Near-Optimal Regret and Applications in Multi-Agent RL and Inventory Control | | 0 |
| Machine Learning Empowered Trajectory and Passive Beamforming Design in UAV-RIS Wireless Networks | | 0 |
| Finite-Time Analysis for Double Q-learning | | 0 |
| Cross Learning in Deep Q-Networks | | 0 |
| Deep Jump Q-Evaluation for Offline Policy Evaluation in Continuous Action Space | | 0 |
| Towards Understanding Linear Value Decomposition in Cooperative Multi-Agent Q-Learning | | 0 |
| Near-Optimal Regret Bounds for Model-Free RL in Non-Stationary Episodic MDPs | | 0 |
| A New Approach for Tactical Decision Making in Lane Changing: Sample Efficient Deep Q Learning with a Safety Feedback Reward | | 0 |

No leaderboard results yet.