SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes the expected long-term reward.
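
To make the agent, environment, reward, and policy vocabulary concrete, here is a minimal sketch of tabular Q-learning on a toy problem. Everything in it is an assumption made for illustration and does not come from any paper listed on this page: the five-state chain environment, the `step`, `train`, and `greedy_policy` helpers, and the hyperparameter values are all hypothetical.

```python
import random

N_STATES = 5      # states 0..4; reaching state 4 ends the episode with reward 1
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Hypothetical toy environment: a 1-D chain with a reward at the right end."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration (illustrative settings)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon, or when both actions look equal.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.choice(ACTIONS)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, action)
            # Bootstrapped target: immediate reward plus discounted best next value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Pick the highest-valued action in every non-terminal state."""
    return [0 if row[0] > row[1] else 1 for row in q[:-1]]


q_table = train()
policy = greedy_policy(q_table)
print(policy)
```

With these settings the learned greedy policy should be "move right" in every non-terminal state, since that is the shortest path to the only reward. The discount factor `gamma` is what turns the single terminal reward into a long-term objective: states further from the goal receive geometrically smaller values.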

Papers

Showing 9551–9600 of 15113 papers

Title | Status | Hype
Ask Your Humans: Using Human Instructions to Improve Generalization in Reinforcement Learning | Code | 1
Game-Theoretic Multiagent Reinforcement Learning | Code | 1
Pseudo Random Number Generation through Reinforcement Learning and Recurrent Neural Networks | Code | 1
FireCommander: An Interactive, Probabilistic Multi-agent Environment for Heterogeneous Robot Teams | Code | 1
A Policy Gradient Algorithm for Learning to Learn in Multiagent Reinforcement Learning | Code | 1
POMO: Policy Optimization with Multiple Optima for Reinforcement Learning | Code | 1
Topic-Preserving Synthetic News Generation: An Adversarial Deep Reinforcement Learning Approach | - | 0
Personalized Multimorbidity Management for Patients with Type 2 Diabetes Using Reinforcement Learning of Electronic Health Records | Code | 0
Recovery RL: Safe Reinforcement Learning with Learned Recovery Zones | Code | 1
Few-Shot Complex Knowledge Base Question Answering via Meta Reinforcement Learning | Code | 1
Machine versus Human Attention in Deep Reinforcement Learning Tasks | - | 0
Reinforcement Learning of Causal Variables Using Mediation Analysis | - | 0
How do Offline Measures for Exploration in Reinforcement Learning behave? | - | 0
Learning Personalized Discretionary Lane-Change Initiation for Fully Autonomous Driving Based on Reinforcement Learning | - | 0
Abstract Value Iteration for Hierarchical Reinforcement Learning | - | 0
DeepFoldit -- A Deep Reinforcement Learning Neural Network Folding Proteins | - | 0
Learning to Unknot | - | 0
Learning to Represent Action Values as a Hypergraph on the Action Vertices | - | 0
Designing Interpretable Approximations to Deep Reinforcement Learning | - | 0
Understanding the Pathologies of Approximate Policy Evaluation when Combined with Greedification in Reinforcement Learning | - | 0
Reinforcement Learning for Sparse-Reward Object-Interaction Tasks in a First-person Simulated 3D Environment | - | 0
Multi-Agent Safe Policy Learning for Power Management of Networked Microgrids | - | 0
Learning to be Safe: Deep RL with a Safety Critic | - | 0
Batch Reinforcement Learning with a Nonparametric Off-Policy Policy Gradient | - | 0
Learning Time Reduction Using Warm Start Methods for a Reinforcement Learning Based Supervisory Control in Hybrid Electric Vehicle Applications | - | 0
Behavior Priors for Efficient Reinforcement Learning | - | 0
COG: Connecting New Skills to Past Experience with Offline Reinforcement Learning | Code | 1
Implicit Under-Parameterization Inhibits Data-Efficient Deep Reinforcement Learning | Code | 1
Can Reinforcement Learning for Continuous Control Generalize Across Physics Engines? | - | 0
Learning Financial Asset-Specific Trading Rules via Deep Reinforcement Learning | Code | 1
Affordance as general value function: A computational model | - | 0
RH-Net: Improving Neural Relation Extraction via Reinforcement Learning and Hierarchical Relational Searching | Code | 0
Conservative Safety Critics for Exploration | - | 0
Succinct and Robust Multi-Agent Communication With Temporal Message Control | Code | 1
Pairwise heuristic sequence alignment algorithm based on deep reinforcement learning | - | 0
OPAL: Offline Primitive Discovery for Accelerating Offline Reinforcement Learning | - | 0
Lyapunov-Based Reinforcement Learning State Estimator | - | 0
Trajectory-wise Multiple Choice Learning for Dynamics Generalization in Reinforcement Learning | Code | 1
VisualHints: A Visual-Lingual Environment for Multimodal Reinforcement Learning | - | 0
MELD: Meta-Reinforcement Learning from Images via Latent State Models | Code | 1
Personalised Meta-path Generation for Heterogeneous GNNs | Code | 1
Track-Assignment Detailed Routing Using Attention-based Policy Model With Supervision | - | 0
Forethought and Hindsight in Credit Assignment | - | 0
High Acceleration Reinforcement Learning for Real-World Juggling with Binary Rewards | - | 0
Contextual Latent-Movements Off-Policy Optimization for Robotic Manipulation Skills | - | 0
Behavioral decision-making for urban autonomous driving in the presence of pedestrians using Deep Recurrent Q-Network | - | 0
Enhancing reinforcement learning by a finite reward response filter with a case study in intelligent structural control | - | 0
How to Make Deep RL Work in Practice | Code | 0
Adaptive Federated Learning and Digital Twin for Industrial Internet of Things | - | 0
Improving the Exploration of Deep Reinforcement Learning in Continuous Domains using Planning for Policy Search | - | 0
Page 192 of 303

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | - | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | - | Unverified