SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of positive or negative rewards for its actions. The goal of reinforcement learning is to find the optimal policy, that is, the decision-making strategy that maximizes the long-term reward.
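
The interaction loop described above can be sketched with tabular Q-learning on a toy problem. This is only an illustrative sketch, not a method taken from any paper listed here: the five-state chain environment, the `step` and `train` helpers, and all hyperparameters (`alpha`, `gamma`, `epsilon`, the episode count) are assumptions made up for the example.

```python
import random

N_STATES = 5          # hypothetical chain of states 0..4; state 4 is terminal
ACTIONS = (0, 1)      # 0 = move left, 1 = move right


def step(state, action):
    """Toy environment (an assumption): returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    # Reward of 1.0 only when the terminal state is reached, 0.0 otherwise.
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap the episode length
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)  # explore
            else:
                # Exploit, breaking ties between equal Q-values at random.
                best = max(q[state])
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # Bootstrapped target: reward plus discounted best next value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


q_table = train()
# Greedy policy for the non-terminal states, read off the learned Q-table.
policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
print(policy)
```

The discount factor `gamma` is what turns the single terminal reward into a long-term objective: states further from the goal receive smaller values, so the greedy policy prefers the action that reaches the reward in fewer steps.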

Papers

Showing 10351-10375 of 15113 papers

| Title | Status | Hype |
|---|---|---|
| Accelerated Deep Reinforcement Learning Based Load Shedding for Emergency Voltage Control | | 0 |
| Efficient Sampling-Based Maximum Entropy Inverse Reinforcement Learning with Application to Autonomous Driving | | 0 |
| dm_control: Software and Tasks for Continuous Control | | 0 |
| Graph Neural Networks and Reinforcement Learning for Behavior Generation in Semantic Environments | Code | 1 |
| Risk-Sensitive Reinforcement Learning: Near-Optimal Risk-Sample Tradeoff in Regret | | 0 |
| Sample-Efficient Reinforcement Learning of Undercomplete POMDPs | | 0 |
| QTRAN++: Improved Value Transformation for Cooperative Multi-Agent Reinforcement Learning | | 0 |
| Safe Reinforcement Learning via Curriculum Induction | Code | 1 |
| Provably Efficient Causal Reinforcement Learning with Confounded Observational Data | | 0 |
| Near-Optimal Reinforcement Learning with Self-Play | | 0 |
| Learning with AMIGo: Adversarially Motivated Intrinsic Goals | Code | 1 |
| Ecological Reinforcement Learning | | 0 |
| Constrained Combinatorial Optimization with Reinforcement Learning | | 0 |
| Hierarchical Reinforcement Learning for Deep Goal Reasoning: An Expressiveness Analysis | | 0 |
| Reinforcement Learning for Mean Field Games with Strategic Complementarities | | 0 |
| Gradient-EM Bayesian Meta-learning | | 0 |
| Automated Optical Multi-layer Design via Deep Reinforcement Learning | Code | 0 |
| Breaking the Curse of Many Agents: Provable Mean Embedding Q-Iteration for Mean-Field Reinforcement Learning | | 0 |
| Sample Factory: Egocentric 3D Control from Pixels at 100000 FPS with Asynchronous Reinforcement Learning | Code | 1 |
| Off-Policy Self-Critical Training for Transformer in Visual Paragraph Generation | | 0 |
| Towards Tractable Optimism in Model-Based Reinforcement Learning | | 0 |
| Robust Reinforcement Learning using Least Squares Policy Iteration with Provable Performance Guarantees | | 0 |
| Entropic Risk Constrained Soft-Robust Policy Optimization | | 0 |
| Accelerating Safe Reinforcement Learning with Constraint-mismatched Policies | | 0 |
| Langevin Dynamics for Adaptive Inverse Reinforcement Learning of Stochastic Gradient Algorithms | | 0 |
Page 415 of 605

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |