SOTAVerified

Q-Learning

Q-learning is a model-free reinforcement learning algorithm. Its goal is to learn a policy that tells an agent which action to take in each state, which it does by estimating an action-value function Q(s, a).

(Image credit: Playing Atari with Deep Reinforcement Learning)
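As a rough illustration of how a policy can be learned this way, here is a minimal sketch of tabular Q-learning. The five-state corridor environment, the `step`, `train`, and `greedy_policy` helpers, and all hyperparameter values are assumptions made up for this example; none of them come from the papers listed on this page.

```python
import random

# Hypothetical toy environment: a 5-state corridor. The agent starts in
# state 0 and receives reward 1.0 when it reaches state 4 (terminal).
N_STATES = 5
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Return (next_state, reward, done) for the toy corridor."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0 for _ in ACTIONS] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                # Explore: pick a uniformly random action.
                action = rng.choice(ACTIONS)
            else:
                # Exploit: pick a greedy action, breaking ties at random.
                best = max(q[state])
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # Bootstrapped target: reward plus discounted best next value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Read the policy off the table: best action per non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

The learned table maps each state to an estimated value per action, and the policy is simply the action with the highest estimate in each state. Deep Q-learning variants replace the table with a neural network but keep the same bootstrapped target.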

Papers

Showing 1376–1400 of 1918 papers

Title | Status | Hype
A General Framework for Learning Mean-Field Games |  | 0
Provably Efficient Model-Free Algorithm for MDPs with Peak Constraints |  | 0
Privacy-Cost Management in Smart Meters Using Deep Reinforcement Learning |  | 0
Indirect and Direct Training of Spiking Neural Networks for End-to-End Control of a Lane-Keeping Vehicle |  | 0
Software-Level Accuracy Using Stochastic Computing With Charge-Trap-Flash Based Weight Matrix |  | 0
A Multi-Agent Reinforcement Learning Approach For Safe and Efficient Behavior Planning Of Connected Autonomous Vehicles |  | 0
Transfer Reinforcement Learning under Unobserved Contextual Information |  | 0
Reinforcement Learning Based Cooperative Coded Caching under Dynamic Popularities in Ultra-Dense Networks |  | 0
Relevance-Guided Modeling of Object Dynamics for Reinforcement Learning |  | 0
Adaptive Structural Hyper-Parameter Configuration by Q-Learning |  | 0
Contextual Policy Transfer in Reinforcement Learning Domains via Deep Mixtures-of-Experts |  | 0
Deep Reinforcement Learning for FlipIt Security Game |  | 0
ConQUR: Mitigating Delusional Bias in Deep Q-learning | Code | 0
Optimistic Exploration even with a Pessimistic Initialisation | Code | 1
Simultaneously Evolving Deep Reinforcement Learning Models using Multifactorial Optimization |  | 0
G-Learner and GIRL: Goal Based Wealth Management with Reinforcement Learning |  | 0
A Double Q-Learning Approach for Navigation of Aerial Vehicles with Connectivity Constraint |  | 0
Millimeter Wave Communications with an Intelligent Reflector: Performance Optimization and Distributional Reinforcement Learning |  | 0
Q-learning with Uniformly Bounded Variance: Large Discounting is Not a Barrier to Fast Learning |  | 0
Periodic Q-Learning |  | 0
Anypath Routing Protocol Design via Q-Learning for Underwater Sensor Networks |  | 0
UAV Aided Search and Rescue Operation Using Reinforcement Learning |  | 0
Agnostic Q-learning with Function Approximation in Deterministic Systems: Tight Bounds on Approximation Error and Sample Complexity |  | 0
Maxmin Q-learning: Controlling the Estimation Bias of Q-learning | Code | 1
Listwise Learning to Rank with Deep Q-Networks |  | 0
Page 56 of 77

No leaderboard results yet.