SOTAVerified

Q-Learning

Q-learning is a model-free reinforcement learning method. It learns an action-value function Q(s, a), and from it a policy that tells an agent which action to take in each state.
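
To make the idea of learning a policy concrete, here is a minimal sketch of tabular Q-learning. Everything in it is an assumption made for illustration and does not come from any listed paper: the five-state chain environment, the `step`, `q_learning`, and `greedy_policy` helpers, the uniformly random behaviour policy, and the values chosen for `alpha` and `gamma`.

```python
import random

N_STATES = 5      # hypothetical toy chain: states 0..4, state 4 is terminal
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Deterministic toy environment: reward 1.0 only on reaching the last state."""
    next_state = max(state - 1, 0) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def q_learning(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Tabular Q-learning with a uniformly random behaviour policy (off-policy)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            action = rng.choice(ACTIONS)
            next_state, reward, done = step(state, action)
            # Bootstrapped target: reward plus discounted best next-state value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Read the policy off the table: the highest-valued action in each non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


q_table = q_learning()
policy = greedy_policy(q_table)
print(policy)
```

In this toy chain the learned greedy policy should choose "right" in every non-terminal state, since that is the only way to reach the rewarded terminal state. The behaviour policy is random rather than epsilon-greedy only to keep the sketch short; Q-learning is off-policy, so the greedy policy can still be read from the table afterwards.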


Papers

Showing 751-800 of 1918 papers

Title | Status | Hype
Fictitious play in zero-sum stochastic games | | 0
Fidelity-based Probabilistic Q-learning for Control of Quantum Systems | | 0
Final Adaptation Reinforcement Learning for N-Player Games | | 0
Finding the best design parameters for optical nanostructures using reinforcement learning | | 0
Finite Horizon Q-learning: Stability, Convergence, Simulations and an application on Smart Grids | | 0
Finite-Sample Analysis for SARSA with Linear Function Approximation | | 0
Finite Sample Analysis of Average-Reward TD Learning and Q-Learning | | 0
Finite-Sample Analysis of Decentralized Q-Learning for Stochastic Games | | 0
Balancing Profit, Risk, and Sustainability for Portfolio Management | | 0
Finite-sample Guarantees for Nash Q-learning with Linear Function Approximation | | 0
Finite-Time Analysis for Double Q-learning | | 0
Finite-Time Analysis of Asynchronous Stochastic Approximation and Q-Learning | | 0
A Discrete-Time Switching System Analysis of Q-learning | | 0
Finite-Time Analysis of Asynchronous Q-learning under Diminishing Step-Size from Control-Theoretic View | | 0
Deep Transfer Q-Learning for Offline Non-Stationary Reinforcement Learning | | 0
Finite-Time Convergence Rates of Decentralized Stochastic Approximation with Applications in Multi-Agent and Multi-Task Learning | | 0
Finite-Time Analysis of Minimax Q-Learning for Two-Player Zero-Sum Markov Games: Switching System Approach | | 0
CoNSoLe: Convex Neural Symbolic Learning | | 0
Finite-Time Analysis of Simultaneous Double Q-learning | | 0
Finite-Time Analysis of Whittle Index based Q-Learning for Restless Multi-Armed Bandits with Neural Network Function Approximation | | 0
Finite-Time Bounds for Two-Time-Scale Stochastic Approximation with Arbitrary Norm Contractions and Markovian Noise | | 0
Finite-Time Error Analysis of Online Model-Based Q-Learning with a Relaxed Sampling Model | | 0
Finite-Time Error Analysis of Soft Q-Learning: Switching System Approach | | 0
FIRE: A Failure-Adaptive Reinforcement Learning Framework for Edge Computing Migrations | | 0
Fire Threat Detection From Videos with Q-Rough Sets | | 0
Fitted Q-Learning for Relational Domains | | 0
Learning in Discounted-cost and Average-cost Mean-field Games | | 0
Fixed-Horizon Temporal Difference Methods for Stable Reinforcement Learning | | 0
Balancing a CartPole System with Reinforcement Learning -- A Tutorial | | 0
ShiQ: Bringing back Bellman to LLMs | | 0
Floyd-Warshall Reinforcement Learning: Learning from Past Experiences to Reach New Goals | | 0
FM3Q: Factorized Multi-Agent MiniMax Q-Learning for Two-Team Zero-Sum Markov Game | | 0
Balanced Q-learning: Combining the Influence of Optimistic and Pessimistic Targets | | 0
Deep Surrogate Q-Learning for Autonomous Driving | | 0
FRAC-Q-Learning: A Reinforcement Learning with Boredom Avoidance Processes for Social Robots | | 0
Almost Sure Convergence Rates and Concentration of Stochastic Approximation and Reinforcement Learning with Markovian Noise | | 0
From r to Q^*: Your Language Model is Secretly a Q-Function | | 0
Continuous Deep Q-Learning in Optimal Control Problems: Normalized Advantage Functions Analysis | | 0
Harnessing Deep Q-Learning for Enhanced Statistical Arbitrage in High-Frequency Trading: A Comprehensive Exploration | | 0
Full Gradient Deep Reinforcement Learning for Average-Reward Criterion | | 0
Functional Stability of Discounted Markov Decision Processes Using Economic MPC Dissipativity Theory | | 0
HAVER: Instance-Dependent Error Bounds for Maximum Mean Estimation and Applications to Q-Learning and Monte Carlo Tree Search | | 0
Continuous-time q-Learning for Jump-Diffusion Models under Tsallis Entropy | | 0
Gap-Dependent Bounds for Federated Q-learning | | 0
Gap-Dependent Bounds for Q-Learning using Reference-Advantage Decomposition | | 0
Gap-Dependent Bounds for Two-Player Markov Games | | 0
GenCos' Behaviors Modeling Based on Q Learning Improved by Dichotomy | | 0
Continuous-time Risk-sensitive Reinforcement Learning via Quadratic Variation Penalty | | 0
Hidden Incentives for Auto-Induced Distributional Shift | | 0
Deep Spectral Q-learning with Application to Mobile Health | | 0

No leaderboard results yet.