
Q-Learning

Q-learning is a model-free reinforcement learning algorithm. It learns an action-value function Q(s, a), from which the agent derives a policy that tells it which action to take in each state.
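As a rough illustration of how such a policy can be learned, here is a minimal tabular Q-learning sketch. The corridor environment, the function names `q_learning` and `greedy_policy`, and all hyperparameter values are assumptions made for this example and are not taken from any paper listed on this page.

```python
import random


def q_learning(n_states=5, episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical 1-D corridor.

    States are 0..n_states-1 and actions are 0 (left) and 1 (right).
    Entering the last state yields reward 1 and ends the episode.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    goal = n_states - 1

    for _ in range(episodes):
        s = 0
        while s != goal:
            # Epsilon-greedy action selection with random tie-breaking.
            if rng.random() < epsilon:
                a = rng.randrange(2)
            else:
                best = max(q[s])
                a = rng.choice([i for i in range(2) if q[s][i] == best])

            # Deterministic transition; moving left from state 0 stays put.
            s_next = max(0, s - 1) if a == 0 else s + 1
            r = 1.0 if s_next == goal else 0.0

            # Q-learning update: bootstrap from the best next-state value,
            # except at the terminal state, where there is no future return.
            target = r if s_next == goal else r + gamma * max(q[s_next])
            q[s][a] += alpha * (target - q[s][a])
            s = s_next

    return q


def greedy_policy(q):
    """Return the index of the highest-valued action for every state."""
    return [max(range(len(row)), key=row.__getitem__) for row in q]
```

Calling `greedy_policy(q_learning())` gives one action per state. In this toy setup the learned policy should prefer moving right in every non-terminal state, since that is the shortest path to the reward.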

(Image credit: Playing Atari with Deep Reinforcement Learning)

Papers

Showing 176-200 of 1918 papers

Title | Status | Hype
AFU: Actor-Free critic Updates in off-policy RL for continuous control | Code | 0
A Framework for Automated Cellular Network Tuning with Reinforcement Learning | Code | 0
Enhancing Robot Assistive Behaviour with Reinforcement Learning and Theory of Mind | Code | 0
Ensemble and Auxiliary Tasks for Data-Efficient Deep Reinforcement Learning | Code | 0
Estimation Error Correction in Deep Reinforcement Learning for Deterministic Actor-Critic Methods | Code | 0
Efficient Model-free Reinforcement Learning in Metric Spaces | Code | 0
Greedy Actor-Critic: A New Conditional Cross-Entropy Method for Policy Improvement | Code | 0
Efficient Sparse-Reward Goal-Conditioned Reinforcement Learning with a High Replay Ratio and Regularization | Code | 0
DynamicLight: Two-Stage Dynamic Traffic Signal Timing | Code | 0
Efficient Collaborative Multi-Agent Deep Reinforcement Learning for Large-Scale Fleet Management | Code | 0
Evolution of cooperation in a bimodal mixture of conditional cooperators | Code | 0
DRL4AOI: A DRL Framework for Semantic-aware AOI Segmentation in Location-Based Services | Code | 0
A Fairness-Oriented Reinforcement Learning Approach for the Operation and Control of Shared Micromobility Services | Code | 0
Adversarial Learning of a Sampler Based on an Unnormalized Distribution | Code | 0
Double Q-PID algorithm for mobile robot control | Code | 0
Double Successive Over-Relaxation Q-Learning with an Extension to Deep Reinforcement Learning | Code | 0
Dual Ensembled Multiagent Q-Learning with Hypernet Regularizer | Code | 0
Distributed-Training-and-Execution Multi-Agent Reinforcement Learning for Power Control in HetNet | Code | 0
Deterministic Implementations for Reproducibility in Deep Reinforcement Learning | Code | 0
Diagnosing Bottlenecks in Deep Q-learning Algorithms | Code | 0
Active inference: demystified and compared | Code | 0
Active exploration in parameterized reinforcement learning | Code | 0
Designing Neural Network Architectures using Reinforcement Learning | Code | 0
Distributionally Robust Deep Q-Learning | Code | 0
Dynamic control of self-assembly of quasicrystalline structures through reinforcement learning | Code | 0

No leaderboard results yet.