SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes long-term reward.
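The interaction loop described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and does not come from this page: the five-state corridor environment, the names `step`, `train`, and `greedy_policy`, and the hyperparameter values. Tabular Q-learning is used only as one simple, well-known way to learn a policy from rewards; it is not claimed to be the method of any paper listed here.

```python
import random

# Hypothetical toy environment (an assumption for illustration): a 5-state
# corridor. The agent starts in state 0 and receives reward 1.0 only when it
# reaches the rightmost state, which ends the episode.
N_STATES = 5
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Return (next_state, reward, done) for one environment transition."""
    if action == 1:
        next_state = min(state + 1, N_STATES - 1)
    else:
        next_state = max(state - 1, 0)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon (or on ties), else act greedily.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.choice(ACTIONS)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, action)
            # Move the estimate toward reward + discounted best next value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Pick the highest-valued action in every state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES)]


q_table = train()
policy = greedy_policy(q_table)
```

Here the reward is the only feedback the agent receives, and the learned table `q_table` estimates long-term (discounted) reward for each state-action pair. In this toy setting the greedy policy should end up preferring "move right" in every non-terminal state.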

Papers

Showing 11801-11850 of 15113 papers

Title | Status | Hype
Improving Sample Efficiency in Model-Free Reinforcement Learning from Images | Code | 1
AI Assisted Annotator using Reinforcement Learning |  | 0
Deep Reinforcement Learning for Single-Shot Diagnosis and Adaptation in Damaged Robots |  | 0
CWAE-IRL: Formulating a supervised approach to Inverse Reinforcement Learning problem |  | 0
Generating Paraphrases with Lean Vocabulary |  | 0
Fair Loss: Margin-Aware Reinforcement Learning for Deep Face Recognition |  | 0
SME-Net: Sparse Motion Estimation for Parametric Video Prediction Through Reinforcement Learning | Code | 0
Deep Reinforcement Active Learning for Human-in-the-Loop Person Re-Identification |  | 0
Quantile QT-Opt for Risk-Aware Vision-Based Robotic Grasping |  | 0
Reinforcement Learning for Multi-Objective Optimization of Online Decisions in High-Dimensional Systems |  | 0
Machine Translation for Machines: the Sentiment Classification Use Case |  | 0
Advantage-Weighted Regression: Simple and Scalable Off-Policy Reinforcement Learning | Code | 1
Generalization in Generation: A closer look at Exposure Bias |  | 0
Dynamic Interaction-Aware Scene Understanding for Reinforcement Learning in Autonomous Driving |  | 0
End-to-End Motion Planning of Quadrotors Using Deep Reinforcement Learning |  | 0
MGHRL: Meta Goal-generation for Hierarchical Reinforcement Learning |  | 0
RLCache: Automated Cache Management Using Reinforcement Learning |  | 0
Multiagent Rollout Algorithms and Reinforcement Learning | Code | 0
Tensor-based Cooperative Control for Large Scale Multi-intersection Traffic Signal Using Deep Reinforcement Learning and Imitation Learning |  | 0
Relational Graph Learning for Crowd Navigation | Code | 0
MULTIPOLAR: Multi-Source Policy Aggregation for Transfer Reinforcement Learning between Diverse Environmental Dynamics | Code | 0
Accelerating the Computation of UCB and Related Indices for Reinforcement Learning |  | 0
Deep Reinforcement Learning Based Power control for Wireless Multicast Systems |  | 0
Adaptive ROI Generation for Video Object Segmentation Using Reinforcement Learning | Code | 0
Deep Coordination Graphs | Code | 0
Interaction-Aware Multi-Agent Reinforcement Learning for Mobile Agents with Individual Goals |  | 0
Playing Atari Ball Games with Hierarchical Reinforcement Learning |  | 0
Safe Reinforcement Learning on Autonomous Vehicles |  | 0
SURREAL-System: Fully-Integrated Stack for Distributed Deep Reinforcement Learning |  | 0
LIMIS: Locally Interpretable Modeling using Instance-wise Subsampling |  | 0
CAQL: Continuous Action Q-Learning |  | 0
MERL: Multi-Head Reinforcement Learning |  | 0
Can Q-Learning with Graph Networks Learn a Generalizable Branching Heuristic for a SAT Solver? | Code | 1
Scaling data-driven robotics with reward sketching and batch reinforcement learning |  | 0
Visual Exploration and Energy-aware Path Planning via Reinforcement Learning | Code | 0
Harnessing Structures for Value-Based Planning and Reinforcement Learning | Code | 0
Relationship Explainable Multi-objective Reinforcement Learning with Semantic Explainability Generation |  | 0
Towards a Metric for Automated Conversational Dialogue System Evaluation and Improvement |  | 0
Solving single-objective tasks by preference multi-objective reinforcement learning |  | 0
QXplore: Q-Learning Exploration by Maximizing Temporal Difference Error |  | 0
Self-Supervised State-Control through Intrinsic Mutual Information Rewards | Code | 0
Modeling Fake News in Social Networks with Deep Multi-Agent Reinforcement Learning |  | 0
Multi-Agent Hierarchical Reinforcement Learning for Humanoid Navigation |  | 0
Multiagent Reinforcement Learning in Games with an Iterated Dominance Solution |  | 0
Partial Simulation for Imitation Learning |  | 0
Long-term planning, short-term adjustments |  | 0
Risk Averse Value Expansion for Sample Efficient and Robust Policy Learning |  | 0
Meta Learning via Learned Loss |  | 0
Stabilizing Off-Policy Reinforcement Learning with Conservative Policy Gradients |  | 0
Policy Optimization by Local Improvement through Search |  | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 |  | Unverified
2 | PPO | Mean Normalized Performance | 0.58 |  | Unverified