SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes long-term reward.
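To make the agent, environment, and reward loop concrete, here is a minimal sketch using tabular Q-learning. It is an illustration only and is not taken from any paper listed on this page. The five-cell corridor environment, the function names (`discounted_return`, `step`, `train_q_learning`, `greedy_policy`), and the hyperparameter values are all assumptions chosen for the example.

```python
import random


def discounted_return(rewards, gamma):
    """Cumulative reward of one episode: sum of gamma**t * r_t."""
    total = 0.0
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


# Hypothetical toy environment: a corridor of 5 cells. The agent starts in
# cell 0 and receives a reward of 1.0 only when it reaches the last cell.
N_STATES = 5
GOAL = N_STATES - 1
LEFT, RIGHT = 0, 1


def step(state, action):
    """Apply an action; return (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == LEFT else state + 1
    done = next_state == GOAL
    return next_state, (1.0 if done else 0.0), done


def train_q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Learn action values by trial and error with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(200):  # step cap so every episode terminates
            tied = q[state][LEFT] == q[state][RIGHT]
            if tied or rng.random() < epsilon:
                action = rng.choice((LEFT, RIGHT))  # explore or break a tie
            else:
                action = LEFT if q[state][LEFT] > q[state][RIGHT] else RIGHT
            next_state, reward, done = step(state, action)
            # Q-learning target: immediate reward plus discounted best next value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Policy that picks the higher-valued action in each non-terminal cell."""
    return [RIGHT if q[s][RIGHT] > q[s][LEFT] else LEFT for s in range(GOAL)]
```

Calling `greedy_policy(train_q_learning())` returns one action per non-terminal cell. In this toy setting the learned policy should move right in every cell, since that is the shortest path to the only reward.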

Papers

Showing 9501–9525 of 15113 papers

| Title | Status | Hype |
|---|---|---|
| On the role of planning in model-based deep reinforcement learning | | 0 |
| Reliable Off-policy Evaluation for Reinforcement Learning | | 0 |
| Sparse Feature Selection Makes Batch Reinforcement Learning More Sample Efficient | | 0 |
| Exploring market power using deep reinforcement learning for intelligent bidding strategies | | 0 |
| Drafting in Collectible Card Games via Reinforcement Learning | Code | 1 |
| A Reinforcement Learning Approach to the Orienteering Problem with Time Windows | Code | 1 |
| Universal Activation Function For Machine Learning | | 0 |
| Single and Multi-Agent Deep Reinforcement Learning for AI-Enabled Wireless Networks: A Tutorial | | 0 |
| Sample-efficient Reinforcement Learning in Robotic Table Tennis | | 0 |
| Motion Prediction on Self-driving Cars: A Review | | 0 |
| The Value Equivalence Principle for Model-Based Reinforcement Learning | | 0 |
| Playing optical tweezers with deep reinforcement learning: in virtual, physical and augmented environments | | 0 |
| RealAnt: An Open-Source Low-Cost Quadruped for Education and Research in Real-World Reinforcement Learning | Code | 1 |
| A Hysteretic Q-learning Coordination Framework for Emerging Mobility Systems in Smart Cities | | 0 |
| LBGP: Learning Based Goal Planning for Autonomous Following in Front | | 0 |
| Learning a Decentralized Multi-arm Motion Planner | Code | 1 |
| Learning to Utilize Shaping Rewards: A New Approach of Reward Shaping | | 0 |
| XCSF for Automatic Test Case Prioritization | Code | 0 |
| Optimal Control-Based Baseline for Guided Exploration in Policy Gradient Methods | | 0 |
| Learning Trajectories for Visual-Inertial System Calibration via Model-based Heuristic Deep Reinforcement Learning | Code | 1 |
| Generative Inverse Deep Reinforcement Learning for Online Recommendation | | 0 |
| Offline Reinforcement Learning from Human Feedback in Real-World Sequence-to-Sequence Tasks | | 0 |
| Differentiable Physics Models for Real-world Offline Model-based Reinforcement Learning | | 0 |
| Deep Reinforcement Learning Based Dynamic Route Planning for Minimizing Travel Time | | 0 |
| Control with adaptive Q-learning | Code | 0 |
Page 381 of 605

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |