SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes the expected long-term reward.
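The interaction loop described above can be illustrated with a minimal sketch. It uses tabular Q-learning, which is only one of many RL algorithms and is chosen here purely for brevity. The five-state chain environment, the hyperparameters, and the names `step`, `train`, and `greedy_policy` are all assumptions made for illustration and do not come from any paper listed on this page.

```python
import random

# Hypothetical toy environment: a chain of states 0..4.
# The agent starts in state 0; reaching state 4 ends the episode with reward 1.
N_STATES = 5
ACTIONS = (-1, +1)  # index 0 moves left, index 1 moves right


def step(state, action):
    """Apply an action and return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn action values by interacting with the chain (tabular Q-learning)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap the episode length
            # Epsilon-greedy action selection; ties are broken at random.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, ACTIONS[a])
            # Bootstrapped target: immediate reward plus discounted future value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Return the highest-valued action for each non-terminal state."""
    return [ACTIONS[max((0, 1), key=lambda a: q[s][a])]
            for s in range(N_STATES - 1)]


if __name__ == "__main__":
    q_table = train()
    print(greedy_policy(q_table))
```

In this toy setting the learned greedy policy should move right in every non-terminal state, since that is the shortest path to the only reward. The discount factor `gamma` is what turns "long-term reward" into a concrete quantity: rewards that arrive later count for less.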

Papers

Showing 13051-13100 of 15113 papers

| Title | Status | Hype |
| --- | --- | --- |
| Total stochastic gradient algorithms and applications in reinforcement learning | | 0 |
| Learning to Schedule Communication in Multi-agent Reinforcement Learning | Code | 0 |
| AlphaStar: An Evolutionary Computation Perspective | | 0 |
| Interactively shaping robot behaviour with unlabeled human instructions | | 0 |
| Adaptive Stress Testing for Autonomous Vehicles | | 0 |
| The Natural Language of Actions | Code | 0 |
| PIPPS: Flexible Model-Based Policy Search Robust to the Curse of Chaos | Code | 0 |
| Value-aware Recommendation based on Reinforced Profit Maximization in E-commerce Systems | | 0 |
| A Meta-MDP Approach to Exploration for Lifelong Reinforcement Learning | Code | 0 |
| Learning User Preferences via Reinforcement Learning with Spatial Interface Valuing | | 0 |
| When Collaborative Filtering Meets Reinforcement Learning | | 0 |
| Non-asymptotic Analysis of Biased Stochastic Approximation Scheme | | 0 |
| Policy Consolidation for Continual Reinforcement Learning | Code | 0 |
| Visual Rationalizations in Deep Reinforcement Learning for Atari Games | | 0 |
| Privacy Preserving Off-Policy Evaluation | | 0 |
| Learning Action Representations for Reinforcement Learning | | 0 |
| Competitive Experience Replay | | 0 |
| Joint Entity Linking with Deep Reinforcement Learning | | 0 |
| An Optimization Framework for Task Sequencing in Curriculum Learning | | 0 |
| A Geometric Perspective on Optimal Representations for Reinforcement Learning | | 0 |
| Contrasting Exploration in Parameter and Action Space: A Zeroth-Order Optimization Perspective | Code | 0 |
| Tsallis Reinforcement Learning: A Unified Framework for Maximum Entropy Reinforcement Learning | | 0 |
| Successor Features Combine Elements of Model-Free and Model-based Reinforcement Learning | | 0 |
| The Value Function Polytope in Reinforcement Learning | | 0 |
| Addressing Sample Complexity in Visual Tasks Using HER and Hallucinatory GANs | Code | 0 |
| Probability Functional Descent: A Unifying Perspective on GANs, Variational Inference, and Reinforcement Learning | | 0 |
| Transfer in Deep Reinforcement Learning Using Successor Features and Generalised Policy Improvement | | 0 |
| Privacy-preserving Q-Learning with Functional Noise in Continuous State Spaces | Code | 0 |
| A Comparative Analysis of Expected and Distributional Reinforcement Learning | | 0 |
| A Regulation Enforcement Solution for Multi-agent Reinforcement Learning | | 0 |
| Self-organization of action hierarchy and compositionality by reinforcement learning with recurrent neural networks | Code | 0 |
| Multi-Agent Reinforcement Learning with Multi-Step Generative Models | | 0 |
| Safe, Efficient, and Comfortable Velocity Control based on Reinforcement Learning for Autonomous Driving | Code | 0 |
| Trust Region-Guided Proximal Policy Optimization | Code | 0 |
| Designing a Multi-Objective Reward Function for Creating Teams of Robotic Bodyguards Using Deep Reinforcement Learning | | 0 |
| CLIC: Curriculum Learning and Imitation for object Control in non-rewarding environments | | 0 |
| Off-Policy Deep Reinforcement Learning by Bootstrapping the Covariate Shift | | 0 |
| Value Propagation for Decentralized Networked Deep Multi-agent Reinforcement Learning | | 0 |
| Reward Shaping via Meta-Learning | | 0 |
| Probabilistic Recursive Reasoning for Multi-Agent Reinforcement Learning | | 0 |
| Action Robust Reinforcement Learning and Applications in Continuous Control | Code | 0 |
| Emergent Linguistic Phenomena in Multi-Agent Communication Games | Code | 0 |
| Model-based Deep Reinforcement Learning for Dynamic Portfolio Optimization | | 0 |
| Sample Complexity of Estimating the Policy Gradient for Nearly Deterministic Dynamical Systems | | 0 |
| Federated Deep Reinforcement Learning | | 0 |
| Feudal Multi-Agent Hierarchies for Cooperative Reinforcement Learning | | 0 |
| Dynamic Measurement Scheduling for Event Forecasting using Deep RL | Code | 0 |
| Decoupling feature extraction from policy learning: assessing benefits of state representation learning in goal based robotics | Code | 0 |
| Hierarchical Reinforcement Learning for Multi-agent MOBA Game | | 0 |
| Distillation Strategies for Proximal Policy Optimization | | 0 |
Page 262 of 303

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |