SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment to maximize a cumulative reward signal. The agent interacts with the environment and learns by receiving feedback in the form of rewards or punishments for its actions. The goal of reinforcement learning is to find the optimal policy or decision-making strategy that maximizes the long-term reward.
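As a rough illustration of the loop described above (agent acts, environment returns a reward, the policy improves), here is a minimal sketch using tabular Q-learning. Everything in it is an assumption made for the example rather than something taken from the listed papers: the five-cell corridor environment, the `step`, `train`, `greedy_policy`, and `discounted_return` helpers, and the values chosen for the discount factor and learning rate. Exploration is uniformly random, which is acceptable here because Q-learning is off-policy.

```python
import random

N_STATES = 5          # hypothetical corridor with cells 0..4; cell 4 is the goal
ACTIONS = (0, 1)      # 0 = step left, 1 = step right
GAMMA = 0.9           # discount factor (assumed value)
ALPHA = 0.5           # learning rate (assumed value)


def step(state, action):
    """Toy environment: returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, max_steps=200, seed=0):
    """Tabular Q-learning with a uniformly random behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            action = rng.choice(ACTIONS)
            next_state, reward, done = step(state, action)
            # Bootstrapped target: reward plus discounted best next value.
            target = reward + (0.0 if done else GAMMA * max(q[next_state]))
            q[state][action] += ALPHA * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Pick the highest-valued action in every non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


def discounted_return(rewards, gamma=GAMMA):
    """Sum of gamma**t * r_t, the quantity the agent tries to maximize."""
    return sum(gamma ** t * r for t, r in enumerate(rewards))


q_table = train()
policy = greedy_policy(q_table)
```

In this toy setting the greedy policy should step right in every cell, and the learned value of stepping right from the start cell should approach the discounted return of the shortest path, `discounted_return([0, 0, 0, 1])`.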

Papers

Showing 7976-8000 of 15113 papers

Title | Status | Hype
Fundamental Limits of Reinforcement Learning in Environment with Endogeneous and Exogeneous Uncertainty | - | 0
Deep Reinforcement Learning for Conservation Decisions | Code | 1
Learning of feature points without additional supervision improves reinforcement learning from images | Code | 0
Sample Efficient Reinforcement Learning In Continuous State Spaces: A Perspective Beyond Linearity | - | 0
Residual Reinforcement Learning from Demonstrations | - | 0
Randomized Exploration for Reinforcement Learning with General Value Function Approximation | Code | 1
On Multi-objective Policy Optimization as a Tool for Reinforcement Learning: Case Studies in Offline RL and Finetuning | - | 0
Towards Safe Control of Continuum Manipulator Using Shielded Multiagent Reinforcement Learning | - | 0
Population-coding and Dynamic-neurons improved Spiking Actor Network for Reinforcement Learning | - | 0
On the Power of Multitask Representation Learning in Linear MDP | - | 0
Targeted Data Acquisition for Evolving Negotiation Agents | - | 0
Efficient (Soft) Q-Learning for Text Generation with Limited Good Data | Code | 1
Poisoning Deep Reinforcement Learning Agents with In-Distribution Triggers | - | 0
Training like Playing: A Reinforcement Learning And Knowledge Graph-based framework for building Automatic Consultation System in Medical Field | - | 0
User-Guided Personalized Image Aesthetic Assessment based on Deep Reinforcement Learning | - | 0
Automatic Document Sketching: Generating Drafts from Analogous Texts | - | 0
Learning-Aided Heuristics Design for Storage System | - | 0
Learning Intrusion Prevention Policies through Optimal Stopping | Code | 1
Online Sub-Sampling for Reinforcement Learning with General Function Approximation | - | 0
Which Mutual-Information Representation Learning Objectives are Sufficient for Control? | - | 0
On-Policy Deep Reinforcement Learning for the Average-Reward Criterion | - | 0
Representation Learning for Out-of-distribution Generalization in Reinforcement Learning | - | 0
Reinforcement Learning as One Big Sequence Modeling Problem | Code | 1
MASAI: Multi-agent Summative Assessment Improvement for Unsupervised Environment Design | Code | 0
Tangent Space Least Adaptive Clustering | - | 0
Page 320 of 605

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | - | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | - | Unverified