SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent learns by interacting with the environment and receiving feedback in the form of rewards or penalties for its actions. The goal is to find a policy, i.e. a decision-making strategy, that maximizes the expected long-term reward.
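The agent-environment loop above can be made concrete with a minimal sketch of tabular Q-learning, one of the simplest RL algorithms. The environment here is a hypothetical five-state chain (an assumption for illustration, not from any paper listed below): action 1 moves right, action 0 moves left, and reaching the rightmost state pays reward 1.

```python
import random

def train_q_learning(n_states=5, n_actions=2, episodes=2000,
                     alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning on a toy chain: action 1 moves right, action 0
    moves left; reaching the terminal state n_states - 1 pays reward 1."""
    rng = random.Random(seed)
    q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        s = 0
        for _ in range(200):  # step cap so every episode terminates
            # epsilon-greedy action selection, breaking ties at random
            if rng.random() < epsilon:
                a = rng.randrange(n_actions)
            else:
                best = max(q[s])
                a = rng.choice([i for i, v in enumerate(q[s]) if v == best])
            # environment step
            s_next = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
            r = 1.0 if s_next == n_states - 1 else 0.0
            # Q-learning update toward the bootstrapped target
            q[s][a] += alpha * (r + gamma * max(q[s_next]) - q[s][a])
            s = s_next
            if s == n_states - 1:
                break
    return q

q = train_q_learning()
# Greedy policy extracted from the learned values for the non-terminal states
policy = [max(range(2), key=lambda a: q[s][a]) for s in range(4)]
print(policy)
```

After training, the greedy policy moves right in every state, matching the optimal values q*(s, right) = 0.9^(3 - s) implied by the discount factor.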

Papers

Showing 5276–5300 of 15113 papers

Title | Status | Hype
Reversible Action Design for Combinatorial Optimization with Reinforcement Learning | | 0
Reversible Upper Confidence Bound Algorithm to Generate Diverse Optimized Candidates | | 0
Review, Analysis and Design of a Comprehensive Deep Reinforcement Learning Framework | | 0
Review of Metrics to Measure the Stability, Robustness and Resilience of Reinforcement Learning | | 0
Revised Progressive-Hedging-Algorithm Based Two-layer Solution Scheme for Bayesian Reinforcement Learning | | 0
Revisiting Design Choices in Offline Model-Based Reinforcement Learning | | 0
Revisiting Estimation Bias in Policy Gradients for Deep Reinforcement Learning | | 0
Revisiting Gaussian mixture critics in off-policy reinforcement learning: a sample-based approach | | 0
Revisiting Peng's Q(λ) for Modern Reinforcement Learning | | 0
Revisiting Some Common Practices in Cooperative Multi-Agent Reinforcement Learning | | 0
Revisiting Space Mission Planning: A Reinforcement Learning-Guided Approach for Multi-Debris Rendezvous | | 0
Offline Reinforcement Learning via Linear-Programming with Error-Bound Induced Constraints | | 0
Revisiting the Master-Slave Architecture in Multi-Agent Deep Reinforcement Learning | | 0
Revisiting the Monotonicity Constraint in Cooperative Multi-Agent Reinforcement Learning | | 0
Revisiting the Roles of “Text” in Text Games | | 0
Revolutionizing Genomics with Reinforcement Learning Techniques | | 0
REvolve: Reward Evolution with Large Language Models using Human Feedback | | 0
Reward-agnostic Fine-tuning: Provable Statistical Benefits of Hybrid Reinforcement Learning | | 0
Reward-Aware Proto-Representations in Reinforcement Learning | | 0
Reward-Balancing for Statistical Spoken Dialogue Systems using Multi-objective Reinforcement Learning | | 0
Reward Biased Maximum Likelihood Estimation for Reinforcement Learning | | 0
Reward Constrained Interactive Recommendation with Natural Language Feedback | | 0
Page 212 of 605

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | | Unverified