SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes long-term reward.
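As a rough illustration of this interaction loop, the sketch below runs an epsilon-greedy agent on a toy two-armed bandit. It is a simplification under stated assumptions: the environment has no state, and the reward values, step count, exploration rate, and the function name `run_bandit` are all hypothetical choices made for this example, not anything taken from the papers listed here.

```python
import random


def run_bandit(rewards=(0.2, 1.0), steps=500, epsilon=0.1, seed=0):
    """Epsilon-greedy agent on a toy bandit with fixed per-action rewards.

    All default values are hypothetical and chosen only for illustration.
    Returns the action the agent considers best and the cumulative reward.
    """
    rng = random.Random(seed)
    estimates = [0.0] * len(rewards)  # running mean reward per action
    counts = [0] * len(rewards)       # how often each action was taken
    total_reward = 0.0

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick an action uniformly at random.
            action = rng.randrange(len(rewards))
        else:
            # Exploit: pick the action with the highest estimate so far.
            action = max(range(len(rewards)), key=lambda a: estimates[a])

        reward = rewards[action]  # feedback from the environment
        counts[action] += 1
        # Incremental update of the running mean for the chosen action.
        estimates[action] += (reward - estimates[action]) / counts[action]
        total_reward += reward

    greedy_action = max(range(len(rewards)), key=lambda a: estimates[a])
    return greedy_action, total_reward


if __name__ == "__main__":
    best, total = run_bandit()
    print("preferred action:", best, "cumulative reward:", total)
```

The learned "policy" here is just the preferred action; in a full RL problem the policy would map each environment state to an action, and the return would typically account for future rewards as well.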

Papers

Showing 11701–11750 of 15113 papers

| Title | Status | Hype |
|---|---|---|
| Rethinking the Discount Factor in Reinforcement Learning: A Decision Theoretic Approach | | 0 |
| ReTool: Reinforcement Learning for Strategic Tool Use in LLMs | | 0 |
| Retrieval-Augmented Reinforcement Learning | | 0 |
| Retrieval of surgical phase transitions using reinforcement learning | | 0 |
| Return Augmented Decision Transformer for Off-Dynamics Reinforcement Learning | | 0 |
| Return-Based Contrastive Representation Learning for Reinforcement Learning | | 0 |
| Return-based Scaling: Yet Another Normalisation Trick for Deep RL | | 0 |
| Return Dispersion as an Estimator of Learning Potential for Prioritized Level Replay | | 0 |
| Revealing Covert Attention by Analyzing Human and Reinforcement Learning Agent Gameplay | | 0 |
| Revealing higher-order neural representations of uncertainty with the Noise Estimation through Reinforcement-based Diffusion (NERD) model | | 0 |
| Revealing the learning process in reinforcement learning agents through attention-oriented metrics | | 0 |
| ReVeal: Self-Evolving Code Agents via Iterative Generation-Verification | | 0 |
| Reverse Curriculum Generation for Reinforcement Learning | | 0 |
| Reversible Action Design for Combinatorial Optimization with Reinforcement Learning | | 0 |
| Reversible Action Design for Combinatorial Optimization with ReinforcementLearning | | 0 |
| Reversible Upper Confidence Bound Algorithm to Generate Diverse Optimized Candidates | | 0 |
| Review, Analysis and Design of a Comprehensive Deep Reinforcement Learning Framework | | 0 |
| Review of Metrics to Measure the Stability, Robustness and Resilience of Reinforcement Learning | | 0 |
| Revised Progressive-Hedging-Algorithm Based Two-layer Solution Scheme for Bayesian Reinforcement Learning | | 0 |
| Revisiting Design Choices in Offline Model-Based Reinforcement Learning | | 0 |
| Revisiting Design Choices in Offline Model Based Reinforcement Learning | | 0 |
| Revisiting Estimation Bias in Policy Gradients for Deep Reinforcement Learning | | 0 |
| Revisiting Gaussian mixture critics in off-policy reinforcement learning: a sample-based approach | | 0 |
| Revisiting Peng's Q(λ) for Modern Reinforcement Learning | | 0 |
| Revisiting Some Common Practices in Cooperative Multi-Agent Reinforcement Learning | | 0 |
| Revisiting Space Mission Planning: A Reinforcement Learning-Guided Approach for Multi-Debris Rendezvous | | 0 |
| Offline Reinforcement Learning via Linear-Programming with Error-Bound Induced Constraints | | 0 |
| Revisiting the Master-Slave Architecture in Multi-Agent Deep Reinforcement Learning | | 0 |
| Revisiting the Monotonicity Constraint in Cooperative Multi-Agent Reinforcement Learning | | 0 |
| Revisiting the Roles of “Text” in Text Games | | 0 |
| Revisiting the Roles of "Text" in Text Games | | 0 |
| Revolutionizing Genomics with Reinforcement Learning Techniques | | 0 |
| REvolve: Reward Evolution with Large Language Models using Human Feedback | | 0 |
| Reward-agnostic Fine-tuning: Provable Statistical Benefits of Hybrid Reinforcement Learning | | 0 |
| Reward-Aware Proto-Representations in Reinforcement Learning | | 0 |
| Reward-Balancing for Statistical Spoken Dialogue Systems using Multi-objective Reinforcement Learning | | 0 |
| Reward Biased Maximum Likelihood Estimation for Reinforcement Learning | | 0 |
| Reward Constrained Interactive Recommendation with Natural Language Feedback | | 0 |
| Reward Design for Driver Repositioning Using Multi-Agent Reinforcement Learning | | 0 |
| Reward Design in Cooperative Multi-agent Reinforcement Learning for Packet Routing | | 0 |
| Reward-Directed Score-Based Diffusion Models via q-Learning | | 0 |
| Reward Estimation via State Prediction | | 0 |
| Reward Fine-Tuning Two-Step Diffusion Models via Learning Differentiable Latent-Space Surrogate Reward | | 0 |
| Reward-Free Attacks in Multi-Agent Reinforcement Learning | | 0 |
| Reward-Free Exploration for Reinforcement Learning | | 0 |
| Reward-Free Model-Based Reinforcement Learning with Linear Function Approximation | | 0 |
| Reward-Free Policy Space Compression for Reinforcement Learning | | 0 |
| Reward-Free RL is No Harder Than Reward-Aware RL in Linear Markov Decision Processes | | 0 |
| Reward Function and Initial Values: Better Choices for Accelerated Goal-Directed Reinforcement Learning | | 0 |
| Reward Function Optimization of a Deep Reinforcement Learning Collision Avoidance System | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |