SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from the feedback it receives for its actions, in the form of rewards or penalties. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes the expected long-term reward.
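As a rough illustration of the agent-environment loop described above, here is a minimal sketch using tabular Q-learning, one common RL algorithm among many. The five-state chain environment, the `step` and `train` helpers, and all hyperparameter values are assumptions made up for this example; they do not come from any paper listed on this page.

```python
import random


def step(state, action, n_states=5):
    """Hypothetical chain environment: action 1 moves right, action 0 moves left.

    Reaching the last state gives reward 1 and ends the episode.
    """
    if action == 1:
        next_state = min(state + 1, n_states - 1)
    else:
        next_state = max(state - 1, 0)
    done = next_state == n_states - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def train(episodes=500, n_states=5, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration (illustrative values)."""
    rng = random.Random(seed)
    # q[state][action] estimates the discounted long-term reward.
    q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon, or when the estimates are tied.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, action, n_states)
            # Bootstrapped target: immediate reward plus discounted future value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Pick the higher-valued action in each state (ties go to action 1)."""
    return [0 if left > right else 1 for left, right in q]


q_table = train()
policy = greedy_policy(q_table)
```

In this toy setting the learned policy should prefer moving right in every non-terminal state, since that is the only way to collect the reward. Real benchmarks replace the hand-written `step` function with a simulator and the table with a function approximator.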

Papers

Showing 9451–9475 of 15113 papers

Title | Status | Hype
Critic PI2: Master Continuous Planning via Policy Improvement with Path Integrals and Deep Actor-Critic Reinforcement Learning | - | 0
Deep Reinforcement Learning of Transition States | - | 0
DeepMind Lab2D | Code | 1
Active Reinforcement Learning: Observing Rewards at a Cost | - | 0
A Review of Uncertainty Quantification in Deep Learning: Techniques, Applications and Challenges | - | 0
Imposing Robust Structured Control Constraint on Reinforcement Learning of Linear Quadratic Regulator | - | 0
Gaussian RAM: Lightweight Image Classification via Stochastic Retina-Inspired Glimpse and Reinforcement Learning | Code | 1
Hierarchical reinforcement learning for efficient exploration and transfer | - | 0
Griddly: A platform for AI research in games | - | 0
Self-supervised reinforcement learning for speaker localisation with the iCub humanoid robot | - | 0
Reinforcement Learning with Videos: Combining Offline Observations with Interaction | Code | 1
Steady State Analysis of Episodic Reinforcement Learning | - | 0
Optimizing Large-Scale Fleet Management on a Road Network using Multi-Agent Deep Reinforcement Learning with Graph Neural Network | Code | 1
Adaptive Neural Architectures for Recommender Systems | - | 0
Non-local Optimization: Imposing Structure on Optimization Problems by Relaxation | - | 0
pymgrid: An Open-Source Python Microgrid Simulator for Applied Artificial Intelligence Research | Code | 1
Reinforcement Learning Experiments and Benchmark for Solving Robotic Reaching Tasks | Code | 0
Proximal Policy Optimization via Enhanced Exploration Efficiency | - | 0
Offline Learning of Counterfactual Predictions for Real-World Robotic Reinforcement Learning | - | 0
Reinforcement Learning with Dual-Observation for General Video Game Playing | Code | 0
Reinforcement Learning with Time-dependent Goals for Robotic Musicians | - | 0
Decentralized Motion Planning for Multi-Robot Navigation using Deep Reinforcement Learning | Code | 1
On Using Hamiltonian Monte Carlo Sampling for Reinforcement Learning Problems in High-dimension | - | 0
CRPO: A New Approach for Safe Reinforcement Learning with Convergence Guarantee | - | 0
Behaviorally Diverse Traffic Simulation via Reinforcement Learning | - | 0
Page 379 of 605

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | - | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | - | Unverified