SOTAVerified

Q-Learning

The goal of Q-learning is to learn a policy, which tells an agent what action to take under what circumstances.

(Image credit: Playing Atari with Deep Reinforcement Learning)
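As a rough illustration of what learning such a policy can look like, here is a minimal tabular Q-learning sketch. It is not taken from any paper listed on this page. The 5-state chain environment (`chain_step`), the function names, and the hyperparameter values are all hypothetical choices made for the example. The agent keeps a table Q[state][action], nudges each entry toward reward + gamma * max(Q[next_state]) after every step, and the learned policy picks the highest-valued action in each state.

```python
import random


def q_learning(n_states, n_actions, step, episodes=2000, alpha=0.1,
               gamma=0.9, epsilon=0.1, max_steps=200, start_state=0, seed=0):
    """Tabular Q-learning with an epsilon-greedy behavior policy.

    `step(state, action)` must return (next_state, reward, done).
    All default values are illustrative assumptions, not recommendations.
    """
    rng = random.Random(seed)
    q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        state = start_state
        for _ in range(max_steps):
            if rng.random() < epsilon:
                # Explore: pick a uniformly random action.
                action = rng.randrange(n_actions)
            else:
                # Exploit: pick a best-valued action, breaking ties at random.
                best = max(q[state])
                action = rng.choice(
                    [a for a, v in enumerate(q[state]) if v == best])
            next_state, reward, done = step(state, action)
            # Bootstrapped target; terminal transitions use the reward alone.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Return the index of the highest-valued action for every state."""
    return [max(range(len(row)), key=lambda a: row[a]) for row in q]


def chain_step(state, action):
    """Hypothetical toy environment: a chain of states 0..4.

    Action 1 moves right, action 0 moves left (state 0 is a wall).
    Reaching state 4 yields reward 1.0 and ends the episode.
    """
    next_state = min(state + 1, 4) if action == 1 else max(state - 1, 0)
    done = next_state == 4
    return next_state, (1.0 if done else 0.0), done


if __name__ == "__main__":
    q_table = q_learning(n_states=5, n_actions=2, step=chain_step)
    print(greedy_policy(q_table))
```

In this toy setting the greedy policy should come to prefer moving right in the non-terminal states, since that is the only route to the reward. The entry for the terminal state is never updated, so its action is arbitrary.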

Papers

Showing 26-50 of 1918 papers

Title | Status | Hype
Automatic Reward Shaping from Confounded Offline Data | | 0
ShiQ: Bringing back Bellman to LLMs | | 0
Bias or Optimality? Disentangling Bayesian Inference and Learning Biases in Human Decision-Making | | 0
Convert Language Model into a Value-based Strategic Planner | | 0
Universal Approximation Theorem for Deep Q-Learning via FBSDE System | | 0
A Large Language Model-Enhanced Q-learning for Capacitated Vehicle Routing Problem with Time Windows | | 0
A critical assessment of reinforcement learning methods for microswimmer navigation in complex flows | Code | 0
Merging and Disentangling Views in Visual Reinforcement Learning for Robotic Manipulation | | 0
VLM Q-Learning: Aligning Vision-Language Models for Interactive Decision-Making | | 0
Meta-Black-Box-Optimization through Offline Q-function Learning | Code | 0
Universal Approximation Theorem of Deep Q-Networks | | 0
Rank-One Modified Value Iteration | | 0
Q-Learning with Clustered-SMART (cSMART) Data: Examining Moderators in the Construction of Clustered Adaptive Interventions | | 0
Learning Neural Control Barrier Functions from Offline Data with Conservatism | | 0
Dynamic and Distributed Routing in IoT Networks based on Multi-Objective Q-Learning | | 0
Interactive Double Deep Q-network: Integrating Human Interventions and Evaluative Predictions in Reinforcement Learning of Autonomous Driving | | 0
Non-Asymptotic Guarantees for Average-Reward Q-Learning with Adaptive Stepsizes | | 0
SAPO-RL: Sequential Actuator Placement Optimization for Fuselage Assembly via Reinforcement Learning | | 0
Mixed-Precision Conjugate Gradient Solvers with RL-Driven Precision Tuning | | 0
Understanding the theoretical properties of projected Bellman equation, linear Q-learning, and approximate value iteration | | 0
Nash Equilibrium Between Consumer Electronic Devices and DoS Attacker for Distributed IoT-enabled RSE Systems | | 0
A Framework of decision-relevant observability: Reinforcement Learning converges under relative ignorability | | 0
State Estimation Using Particle Filtering in Adaptive Machine Learning Methods: Integrating Q-Learning and NEAT Algorithms with Noisy Radar Measurements | | 0
OmniEcon Nexus: Global Microeconomic Simulation Engine | Code | 0
Deep Reinforcement Learning Algorithms for Option Hedging | Code | 0
Page 2 of 77

No leaderboard results yet.