SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes the expected long-term reward.
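The interaction loop described above can be illustrated with a minimal sketch of tabular Q-learning, one of many RL methods and not one this page singles out. Everything in it is an assumption made for illustration: the five-state corridor environment, the `step`, `train`, and `greedy_policy` helpers, and the hyperparameter values are all hypothetical.

```python
import random

# Hypothetical toy environment (an assumption for illustration):
# a corridor of states 0..4 where state 4 is the goal.
N_STATES = 5
ACTIONS = (-1, +1)  # move left or move right


def step(state, action):
    """Apply an action; reward is 1.0 on reaching the goal, else 0.0."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    # q[state][action_index] estimates the discounted long-term reward.
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon, or when the estimates tie.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, ACTIONS[a])
            # Bootstrapped target: immediate reward plus discounted future value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """Read off the action with the highest estimated value in each state."""
    return [ACTIONS[0 if row[0] > row[1] else 1] for row in q]
```

Under these assumptions, `greedy_policy(train())` would be expected to choose `+1` (move right) in every non-terminal state, since that is the only route to the reward. The discount factor `gamma` is what makes the agent value long-term reward rather than only the immediate one.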

Papers

Showing 8951–8975 of 15113 papers

| Title | Status | Hype |
|---|---|---|
| Multi-Objective SPIBB: Seldonian Offline Policy Improvement with Safety Constraints in Finite MDPs | | 0 |
| Tesseract: Tensorised Actors for Multi-Agent Reinforcement Learning | | 0 |
| Deep Reinforcement Learning in Quantitative Algorithmic Trading: A Review | Code | 0 |
| AppBuddy: Learning to Accomplish Tasks in Mobile Apps via Reinforcement Learning | | 0 |
| Reducing the Deployment-Time Inference Control Costs of Deep Reinforcement Learning Agents via an Asymmetric Architecture | | 0 |
| Shaped Policy Search for Evolutionary Strategies using Waypoints | | 0 |
| On the Theory of Reinforcement Learning with Once-per-Episode Feedback | | 0 |
| Predictive Representation Learning for Language Modeling | | 0 |
| Gradient-Free Neural Network Training via Synaptic-Level Reinforcement Learning | | 0 |
| A Survey of Deep Reinforcement Learning Algorithms for Motion Planning and Control of Autonomous Vehicles | | 0 |
| Joint Optimization of Multi-Objective Reinforcement Learning with Policy Gradient Based Algorithm | | 0 |
| Reconfigurable Intelligent Surface-assisted Multi-UAV Networks: Efficient Resource Allocation with Deep Reinforcement Learning | | 0 |
| Learning Approximate and Exact Numeral Systems via Reinforcement Learning | | 0 |
| A nearly Blackwell-optimal policy gradient method | Code | 0 |
| Sample-Efficient Reinforcement Learning for Linearly-Parameterized MDPs with a Generative Model | | 0 |
| Risk-Aware Transfer in Reinforcement Learning using Successor Features | | 0 |
| Task-Guided Inverse Reinforcement Learning Under Partial Information | | 0 |
| Reinforcement Learning reveals fundamental limits on the mixing of active particles | | 0 |
| Transferable Deep Reinforcement Learning Framework for Autonomous Vehicles with Joint Radar-Data Communications | | 0 |
| Stochastic Intervention for Causal Inference via Reinforcement Learning | | 0 |
| Reinforcement Learning for on-line Sequence Transformation | | 0 |
| Optimistic Reinforcement Learning by Forward Kullback-Leibler Divergence Optimization | | 0 |
| Pattern Transfer Learning for Reinforcement Learning in Order Dispatching | | 0 |
| Branching Dueling Q-Network Based Online Scheduling of a Microgrid With Distributed Energy Storage Systems | | 0 |
| Adversarial Intrinsic Motivation for Reinforcement Learning | Code | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |