SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback on its actions, given as rewards or penalties. The goal is to find an optimal policy: a decision-making strategy that maximizes expected long-term reward.
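The interaction loop described above can be illustrated with a minimal sketch. It assumes a hypothetical five-state corridor environment (the `step` function, the state layout, and all hyperparameters are made up for illustration) and uses tabular Q-learning, which is just one of many RL algorithms and is not prescribed by this page.

```python
import random

N_STATES = 5       # hypothetical corridor: states 0..4, state 4 is the goal
ACTIONS = (0, 1)   # 0 = move left, 1 = move right


def step(state, action):
    """Toy environment (an assumption): returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Tabular Q-learning with a uniformly random behavior policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            action = rng.choice(ACTIONS)  # explore uniformly at random
            next_state, reward, done = step(state, action)
            # Bootstrapped target: reward plus discounted best next value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Pick the highest-valued action in each non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


q_table = train()
policy = greedy_policy(q_table)
```

Here the reward arrives only on reaching the goal, so the learned policy has to propagate that signal backward through the discount factor `gamma`. This is the "long-term reward" aspect of the definition in miniature.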

Papers

Showing 5326–5350 of 15113 papers

| Title | Status | Hype |
| --- | --- | --- |
| Rewardless Open-Ended Learning (ROEL) | | 0 |
| Reward Machine Inference for Robotic Manipulation | | 0 |
| Reward (Mis)design for Autonomous Driving | | 0 |
| Reward Poisoning Attacks on Offline Multi-Agent Reinforcement Learning | | 0 |
| Reward Poisoning in Reinforcement Learning: Attacks Against Unknown Learners in Unknown Environments | | 0 |
| Reward prediction for representation learning and reward shaping | | 0 |
| Reward-Predictive Clustering | | 0 |
| STIR^2: Reward Relabelling for combined Reinforcement and Imitation Learning on sparse-reward tasks | | 0 |
| Reward-Respecting Subtasks for Model-Based Reinforcement Learning | | 0 |
| Rewards Encoding Environment Dynamics Improves Preference-based Reinforcement Learning | | 0 |
| Reward Shaping for Reinforcement Learning with Omega-Regular Objectives | | 0 |
| Reward Shaping for User Satisfaction in a REINFORCE Recommender | | 0 |
| Reward Shaping via Diffusion Process in Reinforcement Learning | | 0 |
| Reward Shaping via Meta-Learning | | 0 |
| Reward Shaping with Dynamic Trajectory Aggregation | | 0 |
| Reward Shaping with Subgoals for Social Navigation | | 0 |
| RewardsOfSum: Exploring Reinforcement Learning Rewards for Summarisation | | 0 |
| Rewards with Negative Examples for Reinforced Topic-Focused Abstractive Summarization | | 0 |
| Reward Tampering Problems and Solutions in Reinforcement Learning: A Causal Influence Diagram Perspective | | 0 |
| Reward Training Wheels: Adaptive Auxiliary Rewards for Robotics Reinforcement Learning | | 0 |
| REX: Rapid Exploration and eXploitation for AI Agents | | 0 |
| ReZero: Enhancing LLM search ability by trying one-more-time | | 0 |
| RIDM: Reinforced Inverse Dynamics Modeling for Learning from a Single Observed Demonstration | | 0 |
| Riemannian Stochastic Gradient Method for Nested Composition Optimization | | 0 |
| RILe: Reinforced Imitation Learning | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |