SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal of reinforcement learning is to find the optimal policy, that is, the decision-making strategy that maximizes the expected long-term reward.
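
The interaction loop described above can be illustrated with a small sketch. Everything in it is an assumption made for illustration and is not taken from any paper listed on this page: the `Corridor` environment is a hypothetical toy task, the hyperparameters are arbitrary, and tabular Q-learning is used simply as one well-known way to learn a policy from rewards.

```python
import random


class Corridor:
    """Hypothetical toy environment: positions 0..4, start at 0, goal at 4."""

    def __init__(self, length=5):
        self.length = length
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        # Action 0 moves left, action 1 moves right; walls clamp the position.
        move = 1 if action == 1 else -1
        self.state = min(max(self.state + move, 0), self.length - 1)
        done = self.state == self.length - 1
        # Reward 1.0 at the goal, a small penalty for every other step.
        reward = 1.0 if done else -0.01
        return self.state, reward, done


def train(episodes=500, alpha=0.5, gamma=0.95, epsilon=0.1, seed=0, max_steps=100):
    """Tabular Q-learning with epsilon-greedy exploration (illustrative settings)."""
    rng = random.Random(seed)
    env = Corridor()
    q = [[0.0, 0.0] for _ in range(env.length)]  # q[state][action]
    for _ in range(episodes):
        state = env.reset()
        for _ in range(max_steps):
            # Explore with probability epsilon, or when both actions are tied.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = env.step(action)
            # Bootstrapped target: immediate reward plus discounted future value.
            future = 0.0 if done else gamma * max(q[next_state])
            q[state][action] += alpha * (reward + future - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Pick the higher-valued action in each state (ties go to 'right')."""
    return [0 if left > right else 1 for left, right in q]


if __name__ == "__main__":
    q_table = train()
    print(greedy_policy(q_table))
```

In this sketch the reward is the agent's only feedback, and the learned policy is read off the value table by choosing the higher-valued action in each state. Larger problems typically replace the table with a function approximator, but the loop of acting, observing a reward, and updating stays the same.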

Papers

Showing 9426-9450 of 15113 papers

| Title | Status | Hype |
| --- | --- | --- |
| Real-Time Optimal Design of Experiment for Parameter Identification of Li-Ion Cell Electrochemical Model | | 0 |
| Real-time Policy Distillation in Deep Reinforcement Learning | | 0 |
| Real-time scheduling of renewable power systems through planning-based reinforcement learning | | 0 |
| Real-world challenges for multi-agent reinforcement learning in grid-interactive buildings | | 0 |
| The Smart Buildings Control Suite: A Diverse Open Source Benchmark to Evaluate and Scale HVAC Control Policies for Sustainability | | 0 |
| Real-World Fluid Directed Rigid Body Control via Deep Reinforcement Learning | | 0 |
| Real-World Human-Robot Collaborative Reinforcement Learning | | 0 |
| Real-World Implementation of Reinforcement Learning Based Energy Coordination for a Cluster of Households | | 0 |
| Real World Offline Reinforcement Learning with Realistic Data Source | | 0 |
| Real-world Ride-hailing Vehicle Repositioning using Deep Reinforcement Learning | | 0 |
| Real-world Video Adaptation with Reinforcement Learning | | 0 |
| Reannealing of Decaying Exploration Based On Heuristic Measure in Deep Q-Network | | 0 |
| Rearrangement with Nonprehensile Manipulation Using Deep Reinforcement Learning | | 0 |
| Reasoning Beyond Limits: Advances and Open Problems for LLMs | | 0 |
| Reasoning-SQL: Reinforcement Learning with SQL Tailored Partial Rewards for Reasoning-Enhanced Text-to-SQL | | 0 |
| Reasoning with Exploration: An Entropy Perspective | | 0 |
| Reasoning With Hierarchical Symbols: Reclaiming Symbolic Policies For Visual Reinforcement Learning | | 0 |
| Reason-SVG: Hybrid Reward RL for Aha-Moments in Vector Graphics Generation | | 0 |
| Rebalanced Multimodal Learning with Data-aware Unimodal Sampling | | 0 |
| REBEL: Reward Regularization-Based Approach for Robotic Reinforcement Learning from Human Feedback | | 0 |
| REBOOT: Reuse Data for Bootstrapping Efficient Real-World Dexterous Manipulation | | 0 |
| Recall Traces: Backtracking Models for Efficient Reinforcement Learning | | 0 |
| Receding Horizon Differential Dynamic Programming | | 0 |
| Receding Horizon Inverse Reinforcement Learning | | 0 |
| Recent Advances in Reinforcement Learning in Finance | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |