SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes the expected long-term reward.
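The interaction loop described above can be sketched in a few lines. The following is a minimal illustration, not taken from any paper listed here: it assumes a hypothetical five-cell corridor environment (`step`), a discount factor of 0.9, and uses tabular Q-learning with epsilon-greedy exploration as one possible learning rule. All names and hyperparameters are illustrative assumptions.

```python
import random

N_STATES = 5        # hypothetical corridor with cells 0..4; cell 4 is the goal
ACTIONS = (0, 1)    # 0 = move left, 1 = move right
GAMMA = 0.9         # assumed discount factor for long-term reward


def step(state, action):
    """Hypothetical environment dynamics: returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def discounted_return(rewards, gamma=GAMMA):
    """Cumulative reward: r_0 + gamma * r_1 + gamma^2 * r_2 + ..."""
    total = 0.0
    for r in reversed(rewards):
        total = r + gamma * total
    return total


def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy: explore with probability epsilon, else act greedily."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    best = max(q[state])
    # Break ties at random so the untrained agent does not get stuck.
    return rng.choice([a for a in ACTIONS if q[state][a] == best])


def train(episodes=500, alpha=0.5, epsilon=0.2, seed=0):
    """Tabular Q-learning: learn action values from interaction alone."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap on episode length
            action = choose_action(q, state, epsilon, rng)
            next_state, reward, done = step(state, action)
            # Bootstrapped target: reward plus discounted best next value.
            target = reward + (0.0 if done else GAMMA * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


q_table = train()
# Greedy policy for the non-terminal cells: the action with the highest value.
greedy_policy = [max(ACTIONS, key=lambda a: q_table[s][a])
                 for s in range(N_STATES - 1)]
print(greedy_policy)
```

Here the policy is simply the greedy action under the learned value table; the methods in the papers below typically replace the table with a neural network and the toy corridor with a far richer environment.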

Papers

Showing 5426-5450 of 15113 papers

| Title | Status | Hype |
| --- | --- | --- |
| RL of Thoughts: Navigating LLM Reasoning with Inference-time Reinforcement Learning | | 0 |
| RLOps: Development Life-cycle of Reinforcement Learning Aided Open RAN | | 0 |
| RL-PINNs: Reinforcement Learning-Driven Adaptive Sampling for Efficient Training of PINNs | | 0 |
| RL-QN: A Reinforcement Learning Framework for Optimal Control of Queueing Systems | | 0 |
| RL-RC-DoT: A Block-level RL agent for Task-Aware Video Compression | | 0 |
| RLS3: RL-Based Synthetic Sample Selection to Enhance Spatial Reasoning in Vision-Language Models for Indoor Autonomous Perception | | 0 |
| RL-Selector: Reinforcement Learning-Guided Data Selection via Redundancy Assessment | | 0 |
| RLSS: A Deep Reinforcement Learning Algorithm for Sequential Scene Generation | | 0 |
| RLTP: Reinforcement Learning to Pace for Delayed Impression Modeling in Preloaded Ads | | 0 |
| RL with KL penalties is better viewed as Bayesian inference | | 0 |
| RLZero: Direct Policy Inference from Language Without In-Domain Supervision | | 0 |
| Efficient Reinforcement Learning Development with RLzoo | | 0 |
| RMIX: Learning Risk-Sensitive Policies for Cooperative Reinforcement Learning Agents | | 0 |
| RMIX: Learning Risk-Sensitive Policies for Cooperative Reinforcement Learning Agents | | 0 |
| RMIX: Risk-Sensitive Multi-Agent Reinforcement Learning | | 0 |
| ROAD: Responsibility-Oriented Reward Design for Reinforcement Learning in Autonomous Driving | | 0 |
| Roadside Units Assisted Localized Automated Vehicle Maneuvering: An Offline Reinforcement Learning Approach | | 0 |
| ROAR: Reinforcing Original to Augmented Data Ratio Dynamics for Wav2Vec2.0 Based ASR | | 0 |
| Robo-Advising: Enhancing Investment with Inverse Optimization and Deep Reinforcement Learning | | 0 |
| Robo-advising: Learning Investors' Risk Preferences via Portfolio Choices | | 0 |
| RoboAssembly: Learning Generalizable Furniture Assembly Policy in a Novel Multi-robot Contact-rich Simulation Environment | | 0 |
| Robot Deformable Object Manipulation via NMPC-generated Demonstrations in Deep Reinforcement Learning | | 0 |
| Robot gains Social Intelligence through Multimodal Deep Reinforcement Learning | | 0 |
| Robotic Arm Control and Task Training through Deep Reinforcement Learning | | 0 |
| Robotic Grasp Manipulation Using Evolutionary Computing and Deep Reinforcement Learning | | 0 |
Page 218 of 605

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |