SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes long-term reward.
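
As a rough illustration of what "cumulative reward" can mean in practice, the sketch below computes a discounted return for one episode. The discount factor `gamma`, the function name `discounted_return`, and the sample reward list are assumptions made for this example; the description above does not specify how rewards are aggregated.

```python
from typing import Sequence


def discounted_return(rewards: Sequence[float], gamma: float = 0.99) -> float:
    """Sum of rewards, each weighted by gamma**t for its time step t.

    gamma is an assumed discount factor; gamma=1.0 gives the plain sum.
    """
    total = 0.0
    # Walk backwards so each step applies G_t = r_t + gamma * G_{t+1}.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


# Hypothetical episode: the agent is rewarded only on the final step.
episode_rewards = [0.0, 0.0, 1.0]
undiscounted = discounted_return(episode_rewards, gamma=1.0)
discounted = discounted_return(episode_rewards, gamma=0.5)
```

With a smaller `gamma`, the same late reward contributes less to the return, which is one common way to express a preference for earlier rewards.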

Papers

Showing 9551–9575 of 15113 papers

Title | Status | Hype
Ask Your Humans: Using Human Instructions to Improve Generalization in Reinforcement Learning | Code | 1
Game-Theoretic Multiagent Reinforcement Learning | Code | 1
Pseudo Random Number Generation through Reinforcement Learning and Recurrent Neural Networks | Code | 1
FireCommander: An Interactive, Probabilistic Multi-agent Environment for Heterogeneous Robot Teams | Code | 1
A Policy Gradient Algorithm for Learning to Learn in Multiagent Reinforcement Learning | Code | 1
POMO: Policy Optimization with Multiple Optima for Reinforcement Learning | Code | 1
Topic-Preserving Synthetic News Generation: An Adversarial Deep Reinforcement Learning Approach | - | 0
Personalized Multimorbidity Management for Patients with Type 2 Diabetes Using Reinforcement Learning of Electronic Health Records | Code | 0
Recovery RL: Safe Reinforcement Learning with Learned Recovery Zones | Code | 1
Few-Shot Complex Knowledge Base Question Answering via Meta Reinforcement Learning | Code | 1
Machine versus Human Attention in Deep Reinforcement Learning Tasks | - | 0
Reinforcement Learning of Causal Variables Using Mediation Analysis | - | 0
How do Offline Measures for Exploration in Reinforcement Learning behave? | - | 0
Learning Personalized Discretionary Lane-Change Initiation for Fully Autonomous Driving Based on Reinforcement Learning | - | 0
Abstract Value Iteration for Hierarchical Reinforcement Learning | - | 0
DeepFoldit -- A Deep Reinforcement Learning Neural Network Folding Proteins | - | 0
Learning to Unknot | - | 0
Learning to Represent Action Values as a Hypergraph on the Action Vertices | - | 0
Designing Interpretable Approximations to Deep Reinforcement Learning | - | 0
Understanding the Pathologies of Approximate Policy Evaluation when Combined with Greedification in Reinforcement Learning | - | 0
Reinforcement Learning for Sparse-Reward Object-Interaction Tasks in a First-person Simulated 3D Environment | - | 0
Multi-Agent Safe Policy Learning for Power Management of Networked Microgrids | - | 0
Learning to be Safe: Deep RL with a Safety Critic | - | 0
Batch Reinforcement Learning with a Nonparametric Off-Policy Policy Gradient | - | 0
Learning Time Reduction Using Warm Start Methods for a Reinforcement Learning Based Supervisory Control in Hybrid Electric Vehicle Applications | - | 0
Page 383 of 605

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | - | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | - | Unverified