SOTAVerified

OpenAI Gym

An open-source toolkit from OpenAI that implements several reinforcement learning benchmarks, including classic control, Atari, robotics, and MuJoCo tasks.

(Description from "Evolutionary learning of interpretable decision trees")

(Image Credit: OpenAI Gym)
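
As a rough illustration of how a Gym-style benchmark is typically driven, the sketch below runs an agent-environment loop and averages the undiscounted episode returns. It assumes the classic Gym interface, in which `reset()` returns an observation and `step(action)` returns `(observation, reward, done, info)`; later Gym releases and Gymnasium return a five-element tuple that splits `done` into `terminated` and `truncated`. `CoinFlipEnv` and `average_return` are hypothetical names written for this example and are not part of Gym.

```python
import random
from typing import Callable, Dict, Tuple


class CoinFlipEnv:
    """Hypothetical toy environment with a classic Gym-style interface.

    NOT part of OpenAI Gym; it only mimics reset()/step() so the example
    runs without third-party dependencies.
    """

    def __init__(self, episode_length: int = 10, seed: int = 0) -> None:
        self.episode_length = episode_length
        self._rng = random.Random(seed)
        self._t = 0
        self._coin = 0

    def reset(self) -> int:
        """Start a new episode and return the first observation (the coin)."""
        self._t = 0
        self._coin = self._rng.randint(0, 1)
        return self._coin

    def step(self, action: int) -> Tuple[int, float, bool, Dict]:
        """Apply an action and return (observation, reward, done, info)."""
        # Reward 1 when the action matches the coin that was just observed.
        reward = 1.0 if action == self._coin else 0.0
        self._t += 1
        self._coin = self._rng.randint(0, 1)
        done = self._t >= self.episode_length
        return self._coin, reward, done, {}


def average_return(env, policy: Callable[[int], int], episodes: int = 5) -> float:
    """Run `episodes` rollouts and average the undiscounted episode returns."""
    total = 0.0
    for _ in range(episodes):
        obs = env.reset()
        done = False
        episode_return = 0.0
        while not done:
            obs, reward, done, _info = env.step(policy(obs))
            episode_return += reward
        total += episode_return
    return total / episodes


if __name__ == "__main__":
    env = CoinFlipEnv(episode_length=10)
    # A policy that copies the observed coin earns the full reward each step.
    print(average_return(env, policy=lambda obs: obs))
```

With a real Gym environment the loop would look the same, with the environment created through the library instead of the toy class; the exact return signature depends on the installed version.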

Papers

Showing 251–300 of 382 papers

| Title | Status | Hype |
|---|---|---|
| Deep Q-Network Based Multi-agent Reinforcement Learning with Binary Action Agents | | 0 |
| EasyRL: A Simple and Extensible Reinforcement Learning Framework | | 0 |
| Integrating Deep Reinforcement Learning Networks with Health System Simulations | Code | 1 |
| Implicit Distributional Reinforcement Learning | Code | 1 |
| OtoWorld: Towards Learning to Separate by Learning to Move | Code | 1 |
| EVO-RL: Evolutionary-Driven Reinforcement Learning | | 0 |
| Concept and the implementation of a tool to convert industry 4.0 environments modeled as FSM to an OpenAI Gym wrapper | | 0 |
| Experience Replay with Likelihood-free Importance Weights | Code | 1 |
| Robust Reinforcement Learning using Least Squares Policy Iteration with Provable Performance Guarantees | | 0 |
| WD3: Taming the Estimation Bias in Deep Reinforcement Learning | | 0 |
| Data Driven Control with Learned Dynamics: Model-Based versus Model-Free Approach | | 0 |
| Balancing a CartPole System with Reinforcement Learning -- A Tutorial | | 0 |
| Refined Continuous Control of DDPG Actors via Parametrised Activation | | 0 |
| An FPGA-Based On-Device Reinforcement Learning Approach using Online Sequential Learning | | 0 |
| Reinforcement Learning with Augmented Data | Code | 1 |
| Analyzing Reinforcement Learning Benchmarks with Random Weight Guessing | Code | 0 |
| Policy Gradient using Weak Derivatives for Reinforcement Learning | | 0 |
| Model-based actor-critic: GAN (model generator) + DRL (actor-critic) => AGI | | 0 |
| Neural Game Engine: Accurate learning of generalizable forward models from pixels | Code | 1 |
| Human AI interaction loop training: New approach for interactive reinforcement learning | | 0 |
| Contextual Policy Transfer in Reinforcement Learning Domains via Deep Mixtures-of-Experts | | 0 |
| State-only Imitation with Transition Dynamics Mismatch | Code | 1 |
| Behavior Cloning in OpenAI using Case Based Reasoning | | 0 |
| Adaptive Temporal Difference Learning with Linear Function Approximation | | 0 |
| Adaptive Experience Selection for Policy Gradient | | 0 |
| PDDLGym: Gym Environments from PDDL Problems | Code | 1 |
| Discrete Action On-Policy Learning with Action-Value Critic | Code | 0 |
| Continuous-action Reinforcement Learning for Playing Racing Games: Comparing SPG to PPO | Code | 0 |
| Sample-based Distributional Policy Gradient | | 0 |
| Blue River Controls: A toolkit for Reinforcement Learning Control Systems on Hardware | Code | 1 |
| Adaptive Droplet Routing in Digital Microfluidic Biochips Using Deep Reinforcement Learning | | 0 |
| Way Off-Policy Batch Deep Reinforcement Learning of Human Preferences in Dialog | | 0 |
| SLM Lab: A Comprehensive Benchmark and Modular Software Framework for Reproducible Deep Reinforcement Learning | Code | 0 |
| Taming an autonomous surface vehicle for path following and collision avoidance using deep reinforcement learning | | 0 |
| Sepsis World Model: A MIMIC-based OpenAI Gym "World Model" Simulator for Sepsis Treatment | | 0 |
| The PlayStation Reinforcement Learning Environment (PSXLE) | Code | 0 |
| Playing Games in the Dark: An approach for cross-modality transfer in reinforcement learning | Code | 0 |
| Accelerating Reinforcement Learning with Suboptimal Guidance | | 0 |
| Gym-Ignition: Reproducible Robotic Simulations for Reinforcement Learning | Code | 0 |
| Challenging On Car Racing Problem from OpenAI gym | | 0 |
| Towards a Reinforcement Learning Environment Toolbox for Intelligent Electric Motor Control | Code | 0 |
| Zap Q-Learning With Nonlinear Function Approximation | | 0 |
| MVFST-RL: An Asynchronous RL Framework for Congestion Control with Delayed Actions | Code | 0 |
| TorchBeast: A PyTorch Platform for Distributed RL | Code | 0 |
| Advantage-Weighted Regression: Simple and Scalable Off-Policy Reinforcement Learning | Code | 1 |
| SURREAL-System: Fully-Integrated Stack for Distributed Deep Reinforcement Learning | | 0 |
| V-MPO: On-Policy Maximum a Posteriori Policy Optimization for Discrete and Continuous Control | Code | 0 |
| Advantage Weighted Regression: Simple and Scalable Off-Policy Reinforcement Learning | | 0 |
| Self-Supervised State-Control through Intrinsic Mutual Information Rewards | Code | 0 |
| MoET: Interpretable and Verifiable Reinforcement Learning via Mixture of Expert Trees | | 0 |

Benchmark Results

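| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | MEow | Average Return | 6,586.33 | | Unverified |
| 2 | TD3 | Average Return | 5,942.55 | | Unverified |
| 3 | SAC | Average Return | 5,208.09 | | Unverified |
| 4 | DDPG | Average Return | 1,712.12 | | Unverified |
| 5 | PPO | Average Return | 608.97 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | SAC | Average Return | 15,836.04 | | Unverified |
| 2 | DDPG | Average Return | 14,934.86 | | Unverified |
| 3 | TD3 | Average Return | 12,026.73 | | Unverified |
| 4 | MEow | Average Return | 10,981.47 | | Unverified |
| 5 | PPO | Average Return | 6,006.11 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | MEow | Average Return | 3,332.99 | | Unverified |
| 2 | TD3 | Average Return | 3,319.98 | | Unverified |
| 3 | SAC | Average Return | 2,882.56 | | Unverified |
| 4 | DDPG | Average Return | 1,290.24 | | Unverified |
| 5 | PPO | Average Return | 790.77 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | MEow | Average Return | 6,923.22 | | Unverified |
| 2 | SAC | Average Return | 6,211.5 | | Unverified |
| 3 | PPO | Average Return | 925.89 | | Unverified |
| 4 | TD3 | Average Return | 198.44 | | Unverified |
| 5 | DDPG | Average Return | 139.14 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | SAC | Average Return | 5,745.27 | | Unverified |
| 2 | MEow | Average Return | 5,526.66 | | Unverified |
| 3 | DDPG | Average Return | 2,994.54 | | Unverified |
| 4 | PPO | Average Return | 2,739.81 | | Unverified |
| 5 | TD3 | Average Return | 2,612.74 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | TLA | Mean Reward | 5,163.54 | | Unverified |
| 2 | AWR | Mean Reward | 5,067 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Orthogonal decision tree | Average Return | 500 | | Unverified |
| 2 | Oblique decision tree | Average Return | 500 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | TLA | Mean Reward | 9,571.99 | | Unverified |
| 2 | AWR | Mean Reward | 9,136 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | TLA | Mean Reward | 3,458.22 | | Unverified |
| 2 | AWR | Mean Reward | 3,405 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Oblique decision tree | Average Return | 272.14 | | Unverified |
| 2 | AWR | Average Return | 229 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Orthogonal decision tree | Average Return | -101.72 | | Unverified |
| 2 | Oblique decision tree | Average Return | -106.02 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | TLA with Hierarchical Reward Functions | Mean Reward | -125.02 | | Unverified |
| 2 | TLA | Mean Reward | -154.92 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | AWR | Mean Reward | 5,813 | | Unverified |
| 2 | TLA | Mean Reward | 3,878.41 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | AWR | Average Return | 4,996 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | TLA | Mean Reward | 9,356.67 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | TLA | Mean Reward | 1,000 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | TLA | Mean Reward | 93.88 | | Unverified |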