SOTAVerified

OpenAI Gym

An open-source toolkit from OpenAI that implements several reinforcement learning benchmarks, including classic control, Atari, robotics, and MuJoCo tasks.

(Description by Evolutionary learning of interpretable decision trees)

Papers

Showing 351–382 of 382 papers

Title | Status | Hype
Provably Efficient Convergence of Primal-Dual Actor-Critic with Nonlinear Function Approximation | | 0
Proximal Policy Gradient: PPO with Policy Gradient | | 0
Proximal Policy Optimization with Continuous Bounded Action Space via the Beta Distribution | | 0
QF-tuner: Breaking Tradition in Reinforcement Learning | | 0
Qualitative Measurements of Policy Discrepancy for Return-Based Deep Q-Network | | 0
Quality Diversity Evolutionary Learning of Decision Trees | | 0
Reward Prediction Error as an Exploration Objective in Deep RL | | 0
RAIL: A modular framework for Reinforcement-learning-based Adversarial Imitation Learning | | 0
RangL: A Reinforcement Learning Competition Platform | | 0
The Smart Buildings Control Suite: A Diverse Open Source Benchmark to Evaluate and Scale HVAC Control Policies for Sustainability | | 0
Recommendation System-based Upper Confidence Bound for Online Advertising | | 0
A Learning Approach to Robot-Agnostic Force-Guided High Precision Assembly | | 0
WD3: Taming the Estimation Bias in Deep Reinforcement Learning | | 0
Refined Continuous Control of DDPG Actors via Parametrised Activation | | 0
REIN-2: Giving Birth to Prepared Reinforcement Learning Agents Using Reinforcement Learning Agents | | 0
Reinforcement Learning Approach for Multi-Agent Flexible Scheduling Problems | | 0
Reinforcement Learning for Robotics and Control with Active Uncertainty Reduction | | 0
Reinforcement Learning using Guided Observability | | 0
Relative Importance Sampling for off-Policy Actor-Critic in Deep Reinforcement Learning | | 0
Remember and Forget Experience Replay for Multi-Agent Reinforcement Learning | | 0
Resilient Control of Networked Microgrids using Vertical Federated Reinforcement Learning: Designs and Real-Time Test-Bed Validations | | 0
Rethinking Population-assisted Off-policy Reinforcement Learning | | 0
Robustness Evaluation of Offline Reinforcement Learning for Robot Control Against Action Perturbations | | 0
Sample-based Distributional Policy Gradient | | 0
Scaling Distributed Multi-task Reinforcement Learning with Experience Sharing | | 0
Scilab-RL: A software framework for efficient reinforcement learning and cognitive modeling research | | 0
Sepsis World Model: A MIMIC-based OpenAI Gym "World Model" Simulator for Sepsis Treatment | | 0
Sequential Learning of Movement Prediction in Dynamic Environments using LSTM Autoencoder | | 0
Session-Level Dynamic Ad Load Optimization using Offline Robust Reinforcement Learning | | 0
SIMILE: Introducing Sequential Information towards More Effective Imitation Learning | | 0
Soft Actor-Critic with Inhibitory Networks for Faster Retraining | | 0
State Distribution-aware Sampling for Deep Q-learning | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | MEow | Average Return | 6,586.33 | | Unverified
2 | TD3 | Average Return | 5,942.55 | | Unverified
3 | SAC | Average Return | 5,208.09 | | Unverified
4 | DDPG | Average Return | 1,712.12 | | Unverified
5 | PPO | Average Return | 608.97 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SAC | Average Return | 15,836.04 | | Unverified
2 | DDPG | Average Return | 14,934.86 | | Unverified
3 | TD3 | Average Return | 12,026.73 | | Unverified
4 | MEow | Average Return | 10,981.47 | | Unverified
5 | PPO | Average Return | 6,006.11 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | MEow | Average Return | 3,332.99 | | Unverified
2 | TD3 | Average Return | 3,319.98 | | Unverified
3 | SAC | Average Return | 2,882.56 | | Unverified
4 | DDPG | Average Return | 1,290.24 | | Unverified
5 | PPO | Average Return | 790.77 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | MEow | Average Return | 6,923.22 | | Unverified
2 | SAC | Average Return | 6,211.5 | | Unverified
3 | PPO | Average Return | 925.89 | | Unverified
4 | TD3 | Average Return | 198.44 | | Unverified
5 | DDPG | Average Return | 139.14 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SAC | Average Return | 5,745.27 | | Unverified
2 | MEow | Average Return | 5,526.66 | | Unverified
3 | DDPG | Average Return | 2,994.54 | | Unverified
4 | PPO | Average Return | 2,739.81 | | Unverified
5 | TD3 | Average Return | 2,612.74 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TLA | Mean Reward | 5,163.54 | | Unverified
2 | AWR | Mean Reward | 5,067 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Orthogonal decision tree | Average Return | 500 | | Unverified
2 | Oblique decision tree | Average Return | 500 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TLA | Mean Reward | 9,571.99 | | Unverified
2 | AWR | Mean Reward | 9,136 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TLA | Mean Reward | 3,458.22 | | Unverified
2 | AWR | Mean Reward | 3,405 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Oblique decision tree | Average Return | 272.14 | | Unverified
2 | AWR | Average Return | 229 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Orthogonal decision tree | Average Return | -101.72 | | Unverified
2 | Oblique decision tree | Average Return | -106.02 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TLA with Hierarchical Reward Functions | Mean Reward | -125.02 | | Unverified
2 | TLA | Mean Reward | -154.92 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | AWR | Mean Reward | 5,813 | | Unverified
2 | TLA | Mean Reward | 3,878.41 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | AWR | Average Return | 4,996 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TLA | Mean Reward | 9,356.67 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TLA | Mean Reward | 1,000 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TLA | Mean Reward | 93.88 | | Unverified