SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes long-term reward.
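The interaction loop described above can be sketched with tabular Q-learning on a toy problem. This is only an illustrative sketch, not a method taken from any of the listed papers: the five-state corridor environment, the `step`, `choose_action`, and `train` helpers, and all hyperparameter values are assumptions made for the example.

```python
import random

N_STATES = 5          # hypothetical corridor: states 0..4, state 4 is the goal
ACTIONS = (0, 1)      # 0 = step left, 1 = step right


def step(state, action):
    """Toy environment dynamics: returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy: explore with probability epsilon, else act greedily."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    best = max(q[state])
    # Break ties at random so the untrained agent does not favor one action.
    return rng.choice([a for a in ACTIONS if q[state][a] == best])


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Run Q-learning and return the learned table q[state][action]."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(1000):  # step cap so an episode cannot run forever
            action = choose_action(q, state, epsilon, rng)
            next_state, reward, done = step(state, action)
            # Bootstrapped target: immediate reward plus discounted future value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


q_table = train()
# Greedy policy for the non-terminal states: the highest-valued action in each.
policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
print(policy)
```

Here the reward arrives only on reaching the goal, so the agent has to propagate that signal backward through the value estimates. The discount factor `gamma` is what makes shorter paths to the reward preferable.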

Papers

Showing 7251–7275 of 15113 papers

Title | Status | Hype
Safe Reinforcement Learning for Legged Locomotion | | 0
Target Network and Truncation Overcome The Deadly Triad in Q-Learning | | 0
Reinforcement Learning in Modern Biostatistics: Constructing Optimal Adaptive Interventions | | 0
GraspARL: Dynamic Grasping via Adversarial Reinforcement Learning | | 0
Cloud-Edge Training Architecture for Sim-to-Real Deep Reinforcement Learning | | 0
Intrinsically-Motivated Reinforcement Learning: A Brief Introduction | | 0
Bilateral Deep Reinforcement Learning Approach for Better-than-human Car Following Model | | 0
Deep Q-network using reservoir computing with multi-layered readout | | 0
Reasoning about Counterfactuals to Improve Human Inverse Reinforcement Learning | Code | 0
Optimized cost function for demand response coordination of multiple EV charging stations using reinforcement learning | | 0
On Practical Reinforcement Learning: Provable Robustness, Scalability, and Statistical Efficiency | Code | 0
The Best of Both Worlds: Reinforcement Learning with Logarithmic Regret and Policy Switches | | 0
Quantum Reinforcement Learning via Policy Iteration | | 0
Pareto Frontier Approximation Network (PA-Net) to Solve Bi-objective TSP | | 0
Reliable validation of Reinforcement Learning Benchmarks | | 0
Evolving Curricula with Regret-Based Environment Design | | 0
Combining Reinforcement Learning and Optimal Transport for the Traveling Salesman Problem | Code | 0
A Survey on Offline Reinforcement Learning: Taxonomy, Review, and Open Problems | Code | 0
Follow your Nose: Using General Value Functions for Directed Exploration in Reinforcement Learning | | 0
Andes_gym: A Versatile Environment for Deep Reinforcement Learning in Power Systems | Code | 0
Learning in Sparse Rewards settings through Quality-Diversity algorithms | | 0
Integrating Contrastive Learning with Dynamic Models for Reinforcement Learning from Images | Code | 0
DreamingV2: Reinforcement Learning with Discrete World Models without Reconstruction | | 0
Hierarchical Reinforcement Learning with AI Planning Models | Code | 0
Distributional Reinforcement Learning for Scheduling of Chemical Production Processes | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | | Unverified