SOTAVerified

Q-Learning

The goal of Q-learning is to learn a policy that tells an agent which action to take in each state.

(Image credit: Playing Atari with Deep Reinforcement Learning)
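As a rough illustration of how such a policy can be learned, here is a minimal sketch of tabular Q-learning. The environment is a hypothetical five-state corridor invented for this example (action 0 moves left, action 1 moves right, reaching the last state yields reward 1), and the function names `q_learning` and `greedy_policy` as well as the hyperparameter values are assumptions, not taken from any paper listed here.

```python
import random


def q_learning(n_states=5, n_actions=2, episodes=500,
               alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical 1-D corridor (illustrative only)."""
    rng = random.Random(seed)
    q = [[0.0] * n_actions for _ in range(n_states)]
    goal = n_states - 1

    for _ in range(episodes):
        state = 0
        for _ in range(100):  # step cap so early random episodes always end
            # Epsilon-greedy action selection with random tie-breaking.
            if rng.random() < epsilon:
                action = rng.randrange(n_actions)
            else:
                best = max(q[state])
                action = rng.choice(
                    [a for a in range(n_actions) if q[state][a] == best])

            # Assumed dynamics: action 0 = left, action 1 = right.
            if action == 0:
                next_state = max(0, state - 1)
            else:
                next_state = min(goal, state + 1)
            done = next_state == goal
            reward = 1.0 if done else 0.0

            # Q-learning update: move Q(s, a) toward the bootstrapped target
            # r + gamma * max_a' Q(s', a'); no bootstrap at terminal states.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])

            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Return the highest-valued action index for each state."""
    return [max(range(len(row)), key=row.__getitem__) for row in q]
```

Calling `greedy_policy(q_learning())` gives one action per state; in this toy corridor the learned policy for the non-terminal states should be to move right. The terminal state's row is never updated, so its entry carries no meaning.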

Papers

Showing 751–800 of 1918 papers

| Title | Status | Hype |
| --- | --- | --- |
| Goal-Conditioned Q-Learning as Knowledge Distillation | Code | 0 |
| Object Goal Navigation using Data Regularized Q-Learning | | 0 |
| Prospect Theory-inspired Automated P2P Energy Trading with Q-learning-based Dynamic Pricing | | 0 |
| Recurrent Neural Network-based Anti-jamming Framework for Defense Against Multiple Jamming Policies | | 0 |
| Diffusion Policies as an Expressive Policy Class for Offline Reinforcement Learning | Code | 2 |
| A Novel Resource Allocation for Anti-jamming in Cognitive-UAVs: an Active Inference Approach | | 0 |
| Compositional Reinforcement Learning for Discrete-Time Stochastic Control Systems | | 0 |
| Reinforcement Learning for Joint V2I Network Selection and Autonomous Driving Policies | | 0 |
| Digital Twin-Assisted Efficient Reinforcement Learning for Edge Task Scheduling | | 0 |
| Mitigating Off-Policy Bias in Actor-Critic Methods with One-Step Q-learning: A Novel Correction Approach | Code | 0 |
| A Maintenance Planning Framework using Online and Offline Deep Reinforcement Learning | | 0 |
| Playing a 2D Game Indefinitely using NEAT and Reinforcement Learning | | 0 |
| Structural Similarity for Improved Transfer in Reinforcement Learning | | 0 |
| Finite-Time Analysis of Asynchronous Q-learning under Diminishing Step-Size from Control-Theoretic View | | 0 |
| On Decentralizing Federated Reinforcement Learning in Multi-Robot Scenarios | | 0 |
| Multi-Source AoI-Constrained Resource Minimization under HARQ: Heterogeneous Sampling Processes | | 0 |
| A Deep Reinforcement Learning Approach for Finding Non-Exploitable Strategies in Two-Player Atari Games | Code | 1 |
| DDPG Learning for Aerial RIS-Assisted MU-MISO Communications | | 0 |
| Approximate Nash Equilibrium Learning for n-Player Markov Games in Dynamic Pricing | | 0 |
| Reactive Exploration to Cope with Non-Stationarity in Lifelong Reinforcement Learning | Code | 1 |
| Reinforced Lin-Kernighan-Helsgaun Algorithms for the Traveling Salesman Problems | Code | 1 |
| Multi-objective Optimization of Notifications Using Offline Reinforcement Learning | | 0 |
| Planning with RL and episodic-memory behavioral priors | | 0 |
| q-Learning in Continuous Time | | 0 |
| Action-modulated midbrain dopamine activity arises from distributed control policies | | 0 |
| Interactive Learning from Natural Language and Demonstrations using Signal Temporal Logic | | 0 |
| On the Learning and Learnability of Quasimetrics | Code | 1 |
| Predicting the Need for Blood Transfusion in Intensive Care Units with Reinforcement Learning | | 0 |
| Reinforcement Learning under Partial Observability Guided by Learned Environment Models | | 0 |
| Recursive Reinforcement Learning | | 0 |
| Federated Stochastic Approximation under Markov Noise and Heterogeneity: Applications in Reinforcement Learning | | 0 |
| The Integration of Machine Learning into Automated Test Generation: A Systematic Mapping Study | | 0 |
| MASER: Multi-Agent Reinforcement Learning with Subgoals Generated from Experience Replay Buffer | Code | 1 |
| Sampling Efficient Deep Reinforcement Learning through Preference-Guided Stochastic Exploration | Code | 1 |
| A Search-Based Testing Approach for Deep Reinforcement Learning Agents | Code | 1 |
| Visual Radial Basis Q-Network | | 0 |
| RL-GA: A Reinforcement Learning-Based Genetic Algorithm for Electromagnetic Detection Satellite Scheduling Problem | | 0 |
| Cooperation between Independent Market Makers | Code | 0 |
| Mildly Conservative Q-Learning for Offline Reinforcement Learning | Code | 1 |
| An Optimization Method-Assisted Ensemble Deep Reinforcement Learning Algorithm to Solve Unit Commitment Problems | | 0 |
| A Study of Continual Learning Methods for Q-Learning | | 0 |
| DeepTPI: Test Point Insertion with Deep Reinforcement Learning | Code | 0 |
| Introspective Experience Replay: Look Back When Surprised | Code | 0 |
| Concentration bounds for SSP Q-learning for average cost MDPs | | 0 |
| Balancing Profit, Risk, and Sustainability for Portfolio Management | | 0 |
| DDPG based on multi-scale strokes for financial time series trading strategy | | 0 |
| Offline RL for Natural Language Generation with Implicit Language Q Learning | Code | 2 |
| Stabilizing Q-learning with Linear Architectures for Provably Efficient Learning | | 0 |
| CoNSoLe: Convex Neural Symbolic Learning | | 0 |
| Graph Backup: Data Efficient Backup Exploiting Markovian Transitions | Code | 0 |
Page 16 of 39

No leaderboard results yet.