SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment to maximize a cumulative reward signal. The agent interacts with the environment and learns by receiving feedback in the form of rewards or punishments for its actions. The goal of reinforcement learning is to find the optimal policy or decision-making strategy that maximizes the long-term reward.
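The interaction loop described above can be sketched with tabular Q-learning on a toy problem. This is only a minimal illustration under stated assumptions: the five-state corridor environment, the function names (`step`, `train`, `greedy_policy`), and the hyperparameters are all hypothetical and do not come from any paper listed on this page.

```python
import random

# Hypothetical toy environment: a 5-state corridor. The agent starts in state 0
# and receives reward 1.0 only on reaching state 4, which ends the episode.
N_STATES = 5
ACTIONS = (-1, +1)  # move left, move right


def step(state, action):
    """Return (next_state, reward, done) for the toy corridor."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration (assumed settings)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                # Explore: pick a random action index.
                a = rng.randrange(2)
            else:
                # Exploit: pick a best action, breaking ties at random.
                best = max(q[state])
                a = rng.choice([i for i in range(2) if q[state][i] == best])
            next_state, reward, done = step(state, ACTIONS[a])
            # Move Q(s, a) toward the reward plus the discounted best next value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """Return the greedy move (-1 or +1) for each non-terminal state."""
    return [ACTIONS[max(range(2), key=lambda i: q[s][i])]
            for s in range(N_STATES - 1)]
```

Calling `greedy_policy(train())` yields one move per non-terminal state. In this toy setting the learned policy should move right everywhere, since that is the only way to collect the reward.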

Papers

Showing 6776–6800 of 15113 papers

| Title | Status | Hype |
| --- | --- | --- |
| Towards Smarter Sensing: 2D Clutter Mitigation in RL-Driven Cognitive MIMO Radar | | 0 |
| Towards Socially and Morally Aware RL agent: Reward Design With LLM | | 0 |
| Towards Synthesizing Complex Programs from Input-Output Examples | | 0 |
| Towards Task-Prioritized Policy Composition | | 0 |
| Towards Understanding Chinese Checkers with Heuristics, Monte Carlo Tree Search, and Deep Reinforcement Learning | | 0 |
| Towards Understanding Deep Policy Gradients: A Case Study on PPO | | 0 |
| The Benefits of Being Categorical Distributional: Uncertainty-aware Regularized Exploration in Reinforcement Learning | | 0 |
| Towards Understanding Distributional Reinforcement Learning: Regularization, Optimization, Acceleration and Sinkhorn Algorithm | | 0 |
| Towards Understanding the Benefit of Multitask Representation Learning in Decision Process | | 0 |
| Towards Unknown-aware Deep Q-Learning | | 0 |
| Towards Unraveling and Improving Generalization in World Models | | 0 |
| Fairness-Oriented User Scheduling for Bursty Downlink Transmission Using Multi-Agent Reinforcement Learning | | 0 |
| Towards using Reinforcement Learning for Scaling and Data Replication in Cloud Systems | | 0 |
| Towards Validating Long-Term User Feedbacks in Interactive Recommendation Systems | | 0 |
| Towards Vision-Based Deep Reinforcement Learning for Robotic Motion Control | | 0 |
| Towards White-box Benchmarks for Algorithm Control | | 0 |
| Toward Task Generalization via Memory Augmentation in Meta-Reinforcement Learning | | 0 |
| TraCeS: Trajectory Based Credit Assignment From Sparse Safety Feedback | | 0 |
| TrackAgent: 6D Object Tracking via Reinforcement Learning | | 0 |
| Track-Assignment Detailed Routing Using Attention-based Policy Model With Supervision | | 0 |
| Tracking as Online Decision-Making: Learning a Policy from Streaming Videos with Reinforcement Learning | | 0 |
| Tracking Control for a Spherical Pendulum via Curriculum Reinforcement Learning | | 0 |
| Tracking the Race Between Deep Reinforcement Learning and Imitation Learning -- Extended Version | | 0 |
| Track-MDP: Reinforcement Learning for Target Tracking with Controlled Sensing | | 0 |
| Tractable Offline Learning of Regular Decision Processes | | 0 |
Page 272 of 605

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |