SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes the expected long-term reward.
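The interaction loop described above can be illustrated with a minimal sketch. It uses tabular Q-learning, which is only one of many RL algorithms and is chosen here purely for brevity. The five-state chain environment, the function names (`step`, `train`, `greedy_policy`), and all hyperparameter values are hypothetical and do not come from any paper listed on this page.

```python
import random

N_STATES = 5       # states 0..4; state 4 is the terminal goal (assumed toy setup)
ACTIONS = (0, 1)   # 0 = move left, 1 = move right


def step(state, action):
    """Hypothetical toy environment: returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0  # reward only on reaching the goal
    return next_state, reward, done


def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)  # explore
            else:
                # exploit, breaking ties at random
                best = max(q[state])
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # Bootstrapped target: immediate reward plus discounted future value
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Best action for each non-terminal state under the learned values."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

In this toy setting the learned policy should move right in every state, since that is the shortest route to the only reward. The discount factor `gamma` is what makes the agent weigh long-term reward rather than only the immediate one.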

Papers

Showing 14001-14050 of 15113 papers

| Title | Status | Hype |
| --- | --- | --- |
| Cooperative Multi-Agent Reinforcement Learning for Low-Level Wireless Communication | | 0 |
| Deep Reinforcement Fuzzing | | 0 |
| Deep Reinforcement Learning of Cell Movement in the Early Stage of C. elegans Embryogenesis | | 0 |
| Autonomous Driving in Reality with Reinforcement Learning and Image Translation | | 0 |
| Expected Policy Gradients for Reinforcement Learning | | 0 |
| DeepTraffic: Crowdsourced Hyperparameter Tuning of Deep Reinforcement Learning Systems for Multi-Agent Dense Traffic Navigation | Code | 0 |
| Distributed Deep Reinforcement Learning: Learn how to play Atari games in 21 minutes | Code | 0 |
| Trading the Twitter Sentiment with Reinforcement Learning | | 0 |
| Sample-Efficient Reinforcement Learning through Transfer and Architectural Priors | | 0 |
| Competitive Multi-agent Inverse Reinforcement Learning with Sub-optimal Demonstrations | Code | 0 |
| Using reinforcement learning to learn how to play text-based games | Code | 0 |
| Faster Deep Q-learning using Neural Episodic Control | | 0 |
| Deep Reinforcement Learning based Optimal Control of Hot Water Systems | | 0 |
| Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor | Code | 1 |
| Jointly Learning to Construct and Control Agents using Deep Reinforcement Learning | Code | 0 |
| DeepMind Control Suite | Code | 1 |
| Learning objects from pixels | | 0 |
| AUTOMATA GUIDED HIERARCHICAL REINFORCEMENT LEARNING FOR ZERO-SHOT SKILL COMPOSITION | | 0 |
| Latent forward model for Real-time Strategy game planning with incomplete information | | 0 |
| Faster Reinforcement Learning with Expert State Sequences | | 0 |
| Learning to Treat Sepsis with Multi-Output Gaussian Process Deep Recurrent Q-Networks | | 0 |
| Learning Dynamic State Abstractions for Model-Based Reinforcement Learning | | 0 |
| Learning Gaussian Policies from Smoothed Action Value Functions | | 0 |
| Domain Adaptation for Deep Reinforcement Learning in Visually Distinct Games | | 0 |
| Exploring Deep Recurrent Models with Reinforcement Learning for Molecule Design | | 0 |
| Combination of Supervised and Reinforcement Learning For Vision-Based Autonomous Control | | 0 |
| Alpha-divergence bridges maximum likelihood and reinforcement learning in neural sequence generation | | 0 |
| A dynamic game approach to training robust deep policies | | 0 |
| Autonomous Vehicle Fleet Coordination With Deep Reinforcement Learning | | 0 |
| Reward Estimation via State Prediction | | 0 |
| Neuron as an Agent | | 0 |
| Using Deep Reinforcement Learning to Generate Rationales for Molecules | | 0 |
| Long Term Memory Network for Combinatorial Optimization Problems | | 0 |
| Policy Gradient For Multidimensional Action Spaces: Action Sampling and Entropy Bonus | | 0 |
| LSD-Net: Look, Step and Detect for Joint Navigation and Multi-View Recognition with Deep Reinforcement Learning | | 0 |
| Predicting Multiple Actions for Stochastic Continuous Control | | 0 |
| Neural Task Graph Execution | | 0 |
| Representing Entropy: A short proof of the equivalence between soft Q-learning and policy gradients | | 0 |
| Model-based imitation learning from state trajectories | | 0 |
| Now I Remember! Episodic Memory For Reinforcement Learning | | 0 |
| Reinforcement Learning via Replica Stacking of Quantum Measurements for the Training of Quantum Boltzmann Machines | | 0 |
| Universal Agent for Disentangling Environments and Tasks | | 0 |
| Residual Loss Prediction: Reinforcement Learning With No Incremental Feedback | Code | 0 |
| NerveNet: Learning Structured Policy with Graph Neural Networks | Code | 0 |
| LatentPoison -- Adversarial Attacks On The Latent Space | | 0 |
| Do Deep Reinforcement Learning Algorithms really Learn to Navigate? | | 0 |
| Learning Robust Rewards with Adverserial Inverse Reinforcement Learning | | 0 |
| Learning an Embedding Space for Transferable Robot Skills | | 0 |
| A Hierarchical Model for Device Placement | | 0 |
| Action-dependent Control Variates for Policy Optimization via Stein Identity | | 0 |
Page 281 of 303

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |