SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes the long-term reward.
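The agent-environment loop described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and does not come from any listed paper: the `ChainEnv` toy environment, its reward values, the `run_episode` and `discounted_return` helpers, and the fixed `always_right` policy are all hypothetical. No learning algorithm is implemented; the sketch only shows how a policy, an environment, and a cumulative reward fit together.

```python
class ChainEnv:
    """Hypothetical toy environment: walk right along a short chain to reach a goal."""

    def __init__(self, length=5):
        self.length = length
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        # Action 1 moves right; any other action moves left (clamped at the start).
        if action == 1:
            self.state += 1
        else:
            self.state = max(0, self.state - 1)
        done = self.state == self.length - 1
        # Assumed reward scheme: small penalty per step, +1.0 on reaching the goal.
        reward = 1.0 if done else -0.1
        return self.state, reward, done


def discounted_return(rewards, gamma):
    """Cumulative reward r_0 + gamma * r_1 + gamma^2 * r_2 + ..."""
    total = 0.0
    for r in reversed(rewards):
        total = r + gamma * total
    return total


def run_episode(env, policy, max_steps=50):
    """Let the policy interact with the environment and collect the rewards."""
    state = env.reset()
    rewards = []
    for _ in range(max_steps):
        action = policy(state)
        state, reward, done = env.step(action)
        rewards.append(reward)
        if done:
            break
    return rewards


def always_right(state):
    """A fixed policy that ignores the state and always moves right."""
    return 1


def always_left(state):
    """A fixed policy that never makes progress toward the goal."""
    return 0


good = run_episode(ChainEnv(), always_right)
bad = run_episode(ChainEnv(), always_left)
print(discounted_return(good, gamma=0.9), discounted_return(bad, gamma=0.9))
```

In this toy setting the `always_right` policy collects a higher return than `always_left`, which is the sense in which one policy is better than another. An RL algorithm would search for such a policy from experience rather than having it hard-coded.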

Papers

Showing 9701-9725 of 15113 papers

Title | Status | Hype
--- | --- | ---
Grid-Interactive Multi-Zone Building Control Using Reinforcement Learning with Global-Local Policy Search | | 0
Random Network Distillation as a Diversity Metric for Both Image and Text Generation | | 0
Measuring Visual Generalization in Continuous Control from Pixels | Code | 1
Model-Based Reinforcement Learning for Type 1 Diabetes Blood Glucose Control | | 0
Balancing Constraints and Rewards with Meta-Gradient D4PG | | 0
Average Cost Optimal Control of Stochastic Systems Using Reinforcement Learning | | 0
Deep Reinforcement Learning for Real-Time Optimization of Pumps in Water Distribution Systems | Code | 1
Deep Reinforcement Learning and Transportation Research: A Comprehensive Review | | 0
Efficient Wasserstein Natural Gradients for Reinforcement Learning | Code | 1
Human-centric Dialog Training via Offline Reinforcement Learning | | 0
Is Plug-in Solver Sample-Efficient for Feature-based Reinforcement Learning? | | 0
Smaller World Models for Reinforcement Learning | | 0
AttendLight: Universal Attention-Based Reinforcement Learning Model for Traffic Signal Control | | 0
Local Search for Policy Iteration in Continuous Control | | 0
The Greatest Teacher, Failure is: Using Reinforcement Learning for SFC Placement Based on Availability and Energy Consumption | | 0
Remote Electrical Tilt Optimization via Safe Reinforcement Learning | | 0
Nearly Minimax Optimal Reward-free Reinforcement Learning | | 0
Deep-Reinforcement-Learning-Based Scheduling with Contiguous Resource Allocation for Next-Generation Cellular Systems | | 0
Controlling Graph Dynamics with Reinforcement Learning and Graph Neural Networks | | 0
Distributed Resource Allocation with Multi-Agent Deep Reinforcement Learning for 5G-V2V Communication | Code | 1
Contrastive Explanations for Reinforcement Learning via Embedded Self Predictions | Code | 0
Safe Reinforcement Learning with Natural Language Constraints | | 0
Reinforcement Learning on Computational Resource Allocation of Cloud-based Wireless Networks | | 0
Robust Constrained-MDPs: Soft-Constrained Robust Policy Optimization under Model Uncertainty | Code | 0
Trust the Model When It Is Confident: Masked Model-based Actor-Critic | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
--- | --- | --- | --- | --- | ---
1 | PPG | Mean Normalized Performance | 0.76 | | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | | Unverified