SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback on its actions, given as rewards or penalties. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes long-term reward.
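
The interaction loop described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and comes from no listed paper: a hypothetical five-cell corridor environment with a reward of 1.0 at the right end, a uniformly random behaviour policy, and tabular Q-learning with a discount factor `gamma` and learning rate `alpha`. It is only one of many ways to learn a policy from reward feedback.

```python
import random

# Hypothetical toy task: a corridor of cells 0..4, where cell 4 is the goal.
N_STATES = 5
GOAL = N_STATES - 1
ACTIONS = ("left", "right")


def step(state, action):
    """Toy environment: move one cell; reward 1.0 only when the goal is reached."""
    next_state = max(0, state - 1) if action == "left" else state + 1
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward, next_state == GOAL


def train(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Tabular Q-learning while acting uniformly at random (off-policy)."""
    rng = random.Random(seed)
    q = [{a: 0.0 for a in ACTIONS} for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            action = rng.choice(ACTIONS)                    # agent acts
            next_state, reward, done = step(state, action)  # environment responds
            # Target: immediate reward plus discounted best estimated future value.
            target = reward if done else reward + gamma * max(q[next_state].values())
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Best-known action for each non-terminal cell."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(GOAL)]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

In this toy setting the learned greedy policy should move right in every cell, since that is the only way to collect the reward. Real benchmarks replace the corridor with a far richer environment and the table with a function approximator.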

Papers

Showing 8851–8900 of 15113 papers

Title | Status | Hype
Recurrent World Models Facilitate Policy Evolution | | 0
Recursive Constraints to Prevent Instability in Constrained Reinforcement Learning | | 0
Recursive Least Squares Advantage Actor-Critic Algorithms | | 0
Recursive Reasoning Graph for Multi-Agent Reinforcement Learning | | 0
Recursive Reinforcement Learning | | 0
Recursive Sparse Pseudo-input Gaussian Process SARSA | | 0
Redirection Controller Using Reinforcement Learning | | 0
Rediscovering Affordance: A Reinforcement Learning Perspective | | 0
RedStar: Does Scaling Long-CoT Data Unlock Better Slow-Reasoning Systems? | | 0
Reduce Computational Cost In Deep Reinforcement Learning Via Randomized Policy Learning | | 0
Reduced-Dimensional Reinforcement Learning Control using Singular Perturbation Approximations | | 0
Reducing Bus Bunching with Asynchronous Multi-Agent Reinforcement Learning | | 0
Reducing Conservativeness Oriented Offline Reinforcement Learning | | 0
WD3: Taming the Estimation Bias in Deep Reinforcement Learning | | 0
Reducing Planning Complexity of General Reinforcement Learning with Non-Markovian Abstractions | | 0
Reducing Risk for Assistive Reinforcement Learning Policies with Diffusion Models | | 0
Reducing the Deployment-Time Inference Control Costs of Deep Reinforcement Learning Agents via an Asymmetric Architecture | | 0
Re-examining Routing Networks for Multi-task Learning | | 0
REFINE-AF: A Task-Agnostic Framework to Align Language Models via Self-Generated Instructions using Reinforcement Learning from Automated Feedback | | 0
Refine and Imitate: Reducing Repetition and Inconsistency in Persuasion Dialogues via Reinforcement Learning and Human Demonstration | | 0
Refined Continuous Control of DDPG Actors via Parametrised Activation | | 0
Offline Minimax Soft-Q-learning Under Realizability and Partial Coverage | | 0
REFINING MONTE CARLO TREE SEARCH AGENTS BY MONTE CARLO TREE SEARCH | | 0
REFUEL: Exploring Sparse Features in Deep Reinforcement Learning for Fast Disease Diagnosis | | 0
Reinforced Genetic Algorithm Learning for Optimizing Computation Graphs | | 0
Regioned Episodic Reinforcement Learning | | 0
Region Growing Curriculum Generation for Reinforcement Learning | | 0
Regression with Linear Factored Functions | | 0
Regret Analysis in Deterministic Reinforcement Learning | | 0
Regret Analysis of Certainty Equivalence Policies in Continuous-Time Linear-Quadratic Systems | | 0
Regret Bounds and Reinforcement Learning Exploration of EXP-based Algorithms | | 0
Regret Bounds for Discounted MDPs | | 0
Regret Bounds for Information-Directed Reinforcement Learning | | 0
Regret Bounds for Learning State Representations in Reinforcement Learning | | 0
Regret Bounds for Markov Decision Processes with Recursive Optimized Certainty Equivalents | | 0
Regret Bounds for Reinforcement Learning via Markov Chain Concentration | | 0
Regret Bounds for Reinforcement Learning with Policy Advice | | 0
Regret Bounds for Risk-Sensitive Reinforcement Learning | | 0
Regret-Free Reinforcement Learning for LTL Specifications | | 0
Regret Minimization for Reinforcement Learning by Evaluating the Optimal Bias Function | | 0
Regret-Optimal Q-Learning with Low Cost for Single-Agent and Federated Reinforcement Learning | | 0
Regularization Guarantees Generalization in Bayesian Reinforcement Learning through Algorithmic Stability | | 0
Compositional Transfer in Hierarchical Reinforcement Learning | | 0
Regularized Inverse Reinforcement Learning | | 0
Regularize! Don't Mix: Multi-Agent Reinforcement Learning without Explicit Centralized Structures | | 0
Regularized Parameter Uncertainty for Improving Generalization in Reinforcement Learning | | 0
Regularized Policies are Reward Robust | | 0
Regularized Policy Iteration | | 0
Regularized Q-learning | | 0
Regularizing Action Policies for Smooth Control with Reinforcement Learning | | 0
Page 178 of 303

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | | Unverified