SOTAVerified

Q-Learning

The goal of Q-learning is to learn a policy that tells an agent which action to take in each state.

(Image credit: Playing Atari with Deep Reinforcement Learning)
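To make the idea of learning a policy concrete, here is a minimal sketch of tabular Q-learning. It assumes a hypothetical one-dimensional corridor environment that is not taken from this page: states are numbered left to right, action 0 moves left, action 1 moves right, and reaching the rightmost state gives a reward of 1 and ends the episode. The function names `q_learning` and `greedy_policy` and all hyperparameter values are illustrative assumptions.

```python
import random


def q_learning(n_states=5, episodes=500, alpha=0.1, gamma=0.9,
               epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical 1-D corridor (illustrative only).

    States are 0..n_states-1; action 0 moves left, action 1 moves right.
    Reaching the rightmost state ends the episode with reward 1.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]  # q[state][action]
    goal = n_states - 1
    for _ in range(episodes):
        state = 0
        while state != goal:
            # Epsilon-greedy action selection; ties are broken at random.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state = max(0, state - 1) if action == 0 else state + 1
            reward = 1.0 if next_state == goal else 0.0
            # Q-learning update: bootstrap from the best action in the
            # next state. The goal row is never updated, so it stays 0.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Return the index of the highest-valued action for each state."""
    return [max(range(len(row)), key=row.__getitem__) for row in q]


if __name__ == "__main__":
    q_table = q_learning()
    print(greedy_policy(q_table))
```

The learned policy is read off the table by taking the highest-valued action in each state. In this toy setting the greedy action for every non-terminal state should be "move right"; the entry for the terminal state is not meaningful because that row is never updated.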

Papers

Showing 1726-1750 of 1918 papers

Title | Status | Hype
Meta-Value Learning: a General Framework for Learning with Learning Awareness | Code | 0
Adversarial Learning of a Sampler Based on an Unnormalized Distribution | Code | 0
Deep Q-learning: a robust control approach | Code | 0
Deep Ordinal Reinforcement Learning | Code | 0
Imitating from auxiliary imperfect demonstrations via Adversarial Density Weighted Regression | Code | 0
Orchestrated Value Mapping for Reinforcement Learning | Code | 0
Angrier Birds: Bayesian reinforcement learning | Code | 0
BAIL: Best-Action Imitation Learning for Batch Deep Reinforcement Learning | Code | 0
Offline Contextual Bandits with Overparameterized Models | Code | 0
An Empirical Study of Deep Reinforcement Learning in Continuing Tasks | Code | 0
Simulation of Nanorobots with Artificial Intelligence and Reinforcement Learning for Advanced Cancer Cell Detection and Tracking | Code | 0
PairVDN - Pair-wise Decomposed Value Functions | Code | 0
Finite-Sample Analysis of Nonlinear Stochastic Approximation with Applications in Reinforcement Learning | Code | 0
Simultaneous Double Q-learning with Conservative Advantage Learning for Actor-Critic Methods | Code | 0
Mixed-Integer Optimal Control via Reinforcement Learning: A Case Study on Hybrid Electric Vehicle Energy Management | Code | 0
Variation-resistant Q-learning: Controlling and Utilizing Estimation Bias in Reinforcement Learning for Better Performance | Code | 0
Parallel Q-Learning: Scaling Off-policy Reinforcement Learning under Massively Parallel Simulation | Code | 0
Parameter-free Reduction of the Estimation Bias in Deep Reinforcement Learning for Deterministic Policy Gradients | Code | 0
RadDQN: a Deep Q Learning-based Architecture for Finding Time-efficient Minimum Radiation Exposure Pathway | Code | 0
Boosting Soft Q-Learning by Bounding | Code | 0
Automaton-Guided Curriculum Generation for Reinforcement Learning Agents | Code | 0
Variations on the Reinforcement Learning performance of Blackjack | Code | 0
Model-Free Adaptive Optimal Control of Episodic Fixed-Horizon Manufacturing Processes using Reinforcement Learning | Code | 0
Deterministic Implementations for Reproducibility in Deep Reinforcement Learning | Code | 0
Designing Neural Network Architectures using Reinforcement Learning | Code | 0
Page 70 of 77

No leaderboard results yet.