SOTAVerified

Q-Learning

The goal of Q-learning is to learn a policy, which tells an agent what action to take under what circumstances.
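
As a rough illustration of the idea above, here is a minimal sketch of tabular Q-learning on a toy problem. The environment (a five-state chain where moving right from the second-to-last state earns a reward of 1), the function names `q_learning` and `greedy_policy`, and all hyperparameter values are assumptions made for this example; none of them come from the papers listed on this page.

```python
import random


def q_learning(n_states=5, episodes=500, alpha=0.1, gamma=0.9,
               epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical 1-D chain environment.

    States are 0..n_states-1, actions are 0 (left) and 1 (right).
    Reaching the last state yields reward 1 and ends the episode.
    """
    rng = random.Random(seed)
    n_actions = 2
    goal = n_states - 1
    # Q-table initialised to zero: q[state][action]
    q = [[0.0] * n_actions for _ in range(n_states)]

    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap on steps per episode
            # Epsilon-greedy action selection, ties broken at random.
            if rng.random() < epsilon:
                action = rng.randrange(n_actions)
            else:
                best = max(q[state])
                action = rng.choice(
                    [a for a in range(n_actions) if q[state][a] == best]
                )

            # Deterministic chain dynamics (assumed for this sketch).
            if action == 0:
                next_state = max(0, state - 1)
            else:
                next_state = min(goal, state + 1)
            done = next_state == goal
            reward = 1.0 if done else 0.0

            # Q-learning update: move Q(s, a) toward the bootstrapped target
            # r + gamma * max_a' Q(s', a'), with no bootstrap at the terminal state.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])

            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Return the greedy action (index of the largest Q-value) for each state."""
    return [max(range(len(row)), key=row.__getitem__) for row in q]
```

Calling `greedy_policy(q_learning())` gives the action the agent would take in each state; the learned policy is simply "pick the action with the highest Q-value in the current state". The entry for the terminal state carries no meaning, since no action is ever taken there.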

(Image credit: Playing Atari with Deep Reinforcement Learning)

Papers

Showing 626-650 of 1918 papers

| Title | Status | Hype |
| --- | --- | --- |
| Approximate information state based convergence analysis of recurrent Q-learning | | 0 |
| Convergence of a Human-in-the-Loop Policy-Gradient Algorithm With Eligibility Trace Under Reward, Policy, and Advantage Feedback | | 0 |
| Approximate Global Convergence of Independent Learning in Multi-Agent Systems | | 0 |
| Control-Tutored Reinforcement Learning: an application to the Herding Problem | | 0 |
| Control-Tutored Reinforcement Learning: Towards the Integration of Data-Driven and Model-Based Control | | 0 |
| Approximate Dynamic Oracle for Dependency Parsing with Reinforcement Learning | | 0 |
| Applying Reinforcement Learning to Option Pricing and Hedging | | 0 |
| Aerial Base Station Positioning and Power Control for Securing Communications: A Deep Q-Network Approach | | 0 |
| Active Inference in Hebbian Learning Networks | | 0 |
| Continuous-time Risk-sensitive Reinforcement Learning via Quadratic Variation Penalty | | 0 |
| Continuous-time q-learning for mean-field control problems | | 0 |
| Application of Deep Reinforcement Learning to Payment Fraud | | 0 |
| Continuous-time q-Learning for Jump-Diffusion Models under Tsallis Entropy | | 0 |
| Application of Deep Q-Network in Portfolio Management | | 0 |
| Adversarial Agents For Attacking Inaudible Voice Activated Devices | | 0 |
| Continuous Deep Q-Learning in Optimal Control Problems: Normalized Advantage Functions Analysis | | 0 |
| Application of Deep Q Learning with Simulation Results for Elevator Optimization | | 0 |
| APF+: Boosting adaptive-potential function reinforcement learning methods with a W-shaped network for high-dimensional games | | 0 |
| Advancing Forest Fire Prevention: Deep Reinforcement Learning for Effective Firebreak Placement | | 0 |
| Active Finite Reward Automaton Inference and Reinforcement Learning Using Queries and Counterexamples | | 0 |
| Contextual Policy Transfer in Reinforcement Learning Domains via Deep Mixtures-of-Experts | | 0 |
| A Penalized Shared-parameter Algorithm for Estimating Optimal Dynamic Treatment Regimens | | 0 |
| Contextual Conservative Q-Learning for Offline Reinforcement Learning | | 0 |
| Constructing narrative using a generative model and continuous action policies | | 0 |
| An Initial Introduction to Cooperative Multi-Agent Reinforcement Learning | | 0 |
Page 26 of 77

No leaderboard results yet.