SOTAVerified

Q-Learning

The goal of Q-learning is to learn a policy, which tells an agent what action to take under what circumstances.

(Image credit: Playing Atari with Deep Reinforcement Learning)
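As a rough illustration of what "learning a policy" means here, the following is a minimal sketch of tabular Q-learning with an epsilon-greedy behaviour policy. The five-state chain environment, the function names (`step`, `train`, `greedy_policy`), and the hyperparameter values are all assumptions made for this example and do not come from any of the listed papers.

```python
import random

# Hypothetical toy environment: a chain of states 0..4, where state 4 is terminal.
N_STATES = 5
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Apply an action; reward is 1.0 only when the right end is reached."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Learn a Q-table with the standard one-step Q-learning update."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy: explore at random with probability epsilon
            # (and when both actions are tied), otherwise act greedily.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[state][a])
            next_state, reward, done = step(state, action)
            # Bootstrapped target: r + gamma * max_a' Q(s', a'); no bootstrap at terminal.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Read the policy off the Q-table: the best action in each non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]
```

The policy is not stored separately: it is recovered from the learned action values by taking the highest-valued action in each state. In this toy chain, `greedy_policy(train())` should prefer moving right in every non-terminal state.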

Papers

Showing 826-850 of 1918 papers

Title | Status | Hype
Convert Language Model into a Value-based Strategic Planner | | 0
Harnessing Deep Q-Learning for Enhanced Statistical Arbitrage in High-Frequency Trading: A Comprehensive Exploration | | 0
Deep Robot Sketching: An application of Deep Q-Learning Networks for human-like sketching | | 0
HAVER: Instance-Dependent Error Bounds for Maximum Mean Estimation and Applications to Q-Learning and Monte Carlo Tree Search | | 0
Hedging of Financial Derivative Contracts via Monte Carlo Tree Search | | 0
Hedging using reinforcement learning: Contextual k-Armed Bandit versus Q-learning | | 0
Cooperation and Reputation Dynamics with Reinforcement Learning | | 0
Hidden Incentives for Auto-Induced Distributional Shift | | 0
Hidden Markov Model Estimation-Based Q-learning for Partially Observable Markov Decision Process | | 0
Hierarchical clustering with deep Q-learning | | 0
Cooperative Control of Mobile Robots with Stackelberg Learning | | 0
Hierarchical Deep Q-Learning Based Handover in Wireless Networks with Dual Connectivity | | 0
Hierarchical Modular Reinforcement Learning Method and Knowledge Acquisition of State-Action Rule for Multi-target Problem | | 0
Cooperative Optimal Output Tracking for Discrete-Time Multiagent Systems: Stabilizing Policy Iteration Frameworks and Analysis | | 0
High dimensional precision medicine from patient-derived xenografts | | 0
High-Dimensional Stock Portfolio Trading with Deep Reinforcement Learning | | 0
Highway Reinforcement Learning | | 0
Hippocampal representations emerge when training recurrent neural networks on a memory dependent maze navigation task | | 0
How to discretize continuous state-action spaces in Q-learning: A symbolic control approach | | 0
Human and Multi-Agent collaboration in a human-MARL teaming framework | | 0
Hybridizing the 1/5-th Success Rule with Q-Learning for Controlling the Mutation Rate of an Evolutionary Algorithm | | 0
Hybrid LLM-DDQN based Joint Optimization of V2I Communication and Autonomous Driving | | 0
Hybrid Policies Using Inverse Rewards for Reinforcement Learning | | 0
Hybrid Q-Learning Applied to Ubiquitous recommender system | | 0
A Conflicts-free, Speed-lossless KAN-based Reinforcement Learning Decision System for Interactive Driving in Roundabouts | | 0

No leaderboard results yet.