SOTAVerified

Q-Learning

The goal of Q-learning is to learn a policy, which tells an agent what action to take under what circumstances.

(Image credit: Playing Atari with Deep Reinforcement Learning)
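As a rough illustration of how such a policy can be learned, the sketch below runs tabular Q-learning on a tiny chain environment. The environment, the function names `q_learning` and `greedy_policy`, and all hyperparameter values are assumptions made up for this example; they are not taken from any paper listed on this page. The update used is the standard one-step rule Q(s, a) ← Q(s, a) + α (r + γ max_a' Q(s', a') − Q(s, a)).

```python
import random


def q_learning(n_states=5, episodes=500, alpha=0.1, gamma=0.9,
               epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical 1-D chain.

    States are 0..n_states-1; actions are 0 (left) and 1 (right).
    Entering the rightmost state yields reward 1 and ends the episode.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    goal = n_states - 1
    for _ in range(episodes):
        s = 0
        while s != goal:
            # Epsilon-greedy selection; ties are broken at random.
            if rng.random() < epsilon or q[s][0] == q[s][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[s][0] > q[s][1] else 1
            # Deterministic transition; moving left from state 0 stays put.
            s_next = max(s - 1, 0) if a == 0 else s + 1
            r = 1.0 if s_next == goal else 0.0
            # The terminal state contributes no future value.
            target = r if s_next == goal else r + gamma * max(q[s_next])
            q[s][a] += alpha * (target - q[s][a])
            s = s_next
    return q


def greedy_policy(q):
    """Return the greedy action for each state (1 when values are tied)."""
    return [0 if row[0] > row[1] else 1 for row in q]
```

Calling `greedy_policy(q_learning())` should give the action "right" for every non-terminal state of this toy chain, which is the learned policy in the sense described above. Real applications typically replace the table with a function approximator, as in deep Q-learning.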

Papers

Showing 1751–1800 of 1918 papers

Title | Status | Hype
Q-CP: Learning Action Values for Cooperative Planning | - | 0
Deep Reinforcement Learning for Vision-Based Robotic Grasping: A Simulated Comparative Evaluation of Off-Policy Methods | Code | 0
Variance Reduction Methods for Sublinear Reinforcement Learning | - | 0
Addressing Function Approximation Error in Actor-Critic Methods | Code | 1
Temporal Difference Models: Model-Free Deep RL for Model-Based Control | - | 0
Weighted Double Deep Multiagent Reinforcement Learning in Stochastic Cooperative Environments | - | 0
Efficient Collaborative Multi-Agent Deep Reinforcement Learning for Large-Scale Fleet Management | Code | 0
A Deep Q-Learning Agent for the L-Game with Variable Batch Training | Code | 0
Monte Carlo Q-learning for General Game Playing | Code | 0
Prioritized Sweeping Neural DynaQ with Multiple Predecessors, and Hippocampal Replays | - | 0
Mean Field Multi-Agent Reinforcement Learning | Code | 1
Q-learning with Nearest Neighbors | - | 0
M-Walk: Learning to Walk over Graphs using Monte Carlo Tree Search | - | 0
Balancing Two-Player Stochastic Games with Soft Q-Learning | - | 0
Deep Reinforcement Learning using Capsules in Advanced Game Environments | - | 0
Using deep Q-learning to understand the tax evasion behavior of risk-averse firms | Code | 0
The QLBS Q-Learner Goes NuQLear: Fitted Q Iteration, Inverse RL, and Option Portfolios | - | 0
Deep Reinforcement Fuzzing | - | 0
DeepTraffic: Crowdsourced Hyperparameter Tuning of Deep Reinforcement Learning Systems for Multi-Agent Dense Traffic Navigation | Code | 0
Trading the Twitter Sentiment with Reinforcement Learning | - | 0
Faster Deep Q-learning using Neural Episodic Control | - | 0
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor | Code | 1
ViZDoom: DRQN with Prioritized Experience Replay, Double-Q Learning, & Snapshot Ensembling | - | 0
ScreenerNet: Learning Self-Paced Curriculum for Deep Neural Networks | - | 0
Autonomous Vehicle Fleet Coordination With Deep Reinforcement Learning | - | 0
Learning Gaussian Policies from Smoothed Action Value Functions | - | 0
Representing Entropy : A short proof of the equivalence between soft Q-learning and policy gradients | - | 0
TD Learning with Constrained Gradients | - | 0
Avoiding Catastrophic States with Intrinsic Fear | - | 0
SBEED: Convergent Reinforcement Learning with Nonlinear Function Approximation | - | 0
A short variational proof of equivalence between policy gradients and soft Q learning | - | 0
Scale-invariant temporal history (SITH): optimal slicing of the past in an uncertain world | - | 0
Deep Neuroevolution: Genetic Algorithms Are a Competitive Alternative for Training Deep Neural Networks for Reinforcement Learning | Code | 0
Improving Exploration in Evolution Strategies for Deep Reinforcement Learning via a Population of Novelty-Seeking Agents | Code | 0
Towards a Deep Reinforcement Learning Approach for Tower Line Wars | - | 0
QLBS: Q-Learner in the Black-Scholes(-Merton) Worlds | Code | 0
Robust Deep Reinforcement Learning with Adversarial Attacks | - | 0
Assumed Density Filtering Q-learning | Code | 0
Deep Primal-Dual Reinforcement Learning: Accelerating Actor-Critic using Bellman Duality | - | 0
Zap Q-Learning | - | 0
Q-LDA: Uncovering Latent Patterns in Text-based Sequential Decision Processes | - | 0
Curriculum Q-Learning for Visual Vocabulary Acquisition | - | 0
A reinforcement learning algorithm for building collaboration in multi-agent systems | - | 0
Classification with Costly Features using Deep Reinforcement Learning | Code | 0
Neural Network Based Reinforcement Learning for Audio-Visual Gaze Control in Human-Robot Interaction | - | 0
BBQ-Networks: Efficient Exploration in Deep Reinforcement Learning for Task-Oriented Dialogue Systems | - | 0
A unified decision making framework for supply and demand management in microgrid networks | - | 0
Double Q(σ) and Q(σ, λ): Unifying Reinforcement Learning Control Algorithms | - | 0
The Effects of Memory Replay in Reinforcement Learning | Code | 0
Deep Reinforcement Learning: Framework, Applications, and Embedded Implementations | - | 0

No leaderboard results yet.