SOTAVerified

OpenAI Gym

An open-source toolkit from OpenAI that implements several reinforcement learning benchmarks, including classic control, Atari, robotics, and MuJoCo tasks.

(Description by Evolutionary learning of interpretable decision trees)

(Image Credit: OpenAI Gym)

Papers

Showing 301-350 of 382 papers

| Title | Status | Hype |
|---|---|---|
| Towards a Reinforcement Learning Environment Toolbox for Intelligent Electric Motor Control | Code | 0 |
| Zap Q-Learning With Nonlinear Function Approximation | | 0 |
| MVFST-RL: An Asynchronous RL Framework for Congestion Control with Delayed Actions | Code | 0 |
| TorchBeast: A PyTorch Platform for Distributed RL | Code | 0 |
| SURREAL-System: Fully-Integrated Stack for Distributed Deep Reinforcement Learning | | 0 |
| V-MPO: On-Policy Maximum a Posteriori Policy Optimization for Discrete and Continuous Control | Code | 0 |
| Self-Supervised State-Control through Intrinsic Mutual Information Rewards | Code | 0 |
| MoET: Interpretable and Verifiable Reinforcement Learning via Mixture of Expert Trees | | 0 |
| Advantage Weighted Regression: Simple and Scalable Off-Policy Reinforcement Learning | | 0 |
| Active inference: demystified and compared | Code | 0 |
| Invariant Transform Experience Replay: Data Augmentation for Deep Reinforcement Learning | Code | 0 |
| MDP Playground: An Analysis and Debug Testbed for Reinforcement Learning | Code | 0 |
| Recommendation System-based Upper Confidence Bound for Online Advertising | | 0 |
| Arena: a toolkit for Multi-Agent Reinforcement Learning | Code | 0 |
| A Dual Memory Structure for Efficient Use of Replay Memory in Deep Reinforcement Learning | | 0 |
| QFlip: An Adaptive Reinforcement Learning Strategy for the FlipIt Security Game | Code | 0 |
| Proximal Distilled Evolutionary Reinforcement Learning | Code | 0 |
| Reward Prediction Error as an Exploration Objective in Deep RL | | 0 |
| Towards Interactive Training of Non-Player Characters in Video Games | Code | 0 |
| Decision-Making in Reinforcement Learning | | 0 |
| Provably Efficient Imitation Learning from Observation Alone | Code | 0 |
| Deep Q-Learning with Q-Matrix Transfer Learning for Novel Fire Evacuation Environment | | 0 |
| In Support of Over-Parametrization in Deep Reinforcement Learning: an Empirical Study | | 0 |
| Reinforcement Learning for Robotics and Control with Active Uncertainty Reduction | | 0 |
| Design of Artificial Intelligence Agents for Games using Deep Reinforcement Learning | | 0 |
| Deep Ordinal Reinforcement Learning | Code | 0 |
| Adversarial Exploration Strategy for Self-Supervised Imitation Learning | | 0 |
| SIMILE: Introducing Sequential Information towards More Effective Imitation Learning | | 0 |
| Towards Combining On-Off-Policy Methods for Real-World Applications | | 0 |
| Evolving Neural Networks in Reinforcement Learning by means of UMDAc | | 0 |
| Towards Brain-inspired System: Deep Recurrent Reinforcement Learning for Simulated Self-driving Agent | | 0 |
| DQN with model-based exploration: efficient learning on environments with sparse rewards | | 0 |
| Towards Characterizing Divergence in Deep Q-Learning | | 0 |
| gym-gazebo2, a toolkit for reinforcement learning using ROS 2 and Gazebo | Code | 0 |
| Deep Reinforcement Learning with Feedback-based Exploration | Code | 0 |
| Learning Gaussian Policies from Corrective Human Feedback | | 0 |
| Deep Active Localization | Code | 0 |
| Flappy Hummingbird: An Open Source Dynamic Simulation of Flapping Wing Robots and Animals | Code | 0 |
| Curiosity-Driven Experience Prioritization via Density Estimation | | 0 |
| Neural-encoding Human Experts' Domain Knowledge to Warm Start Reinforcement Learning | Code | 0 |
| Learn a Prior for RHEA for Better Online Planning | | 0 |
| Towards Physically Safe Reinforcement Learning under Supervision | | 0 |
| Deconfounding Reinforcement Learning in Observational Settings | Code | 0 |
| Iroko: A Framework to Prototype Reinforcement Learning for Data Center Traffic Control | Code | 0 |
| Relative Entropy Regularized Policy Iteration | Code | 0 |
| BlockPuzzle - A Challenge in Physical Reasoning and Generalization for Robot Learning | | 0 |
| Relative Importance Sampling for off-Policy Actor-Critic in Deep Reinforcement Learning | | 0 |
| Sequential Learning of Movement Prediction in Dynamic Environments using LSTM Autoencoder | | 0 |
| Reinforcement Learning for Improving Agent Design | Code | 0 |
| Hybrid Policies Using Inverse Rewards for Reinforcement Learning | | 0 |

Benchmark Results

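| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | MEow | Average Return | 6,586.33 | | Unverified |
| 2 | TD3 | Average Return | 5,942.55 | | Unverified |
| 3 | SAC | Average Return | 5,208.09 | | Unverified |
| 4 | DDPG | Average Return | 1,712.12 | | Unverified |
| 5 | PPO | Average Return | 608.97 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | SAC | Average Return | 15,836.04 | | Unverified |
| 2 | DDPG | Average Return | 14,934.86 | | Unverified |
| 3 | TD3 | Average Return | 12,026.73 | | Unverified |
| 4 | MEow | Average Return | 10,981.47 | | Unverified |
| 5 | PPO | Average Return | 6,006.11 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | MEow | Average Return | 3,332.99 | | Unverified |
| 2 | TD3 | Average Return | 3,319.98 | | Unverified |
| 3 | SAC | Average Return | 2,882.56 | | Unverified |
| 4 | DDPG | Average Return | 1,290.24 | | Unverified |
| 5 | PPO | Average Return | 790.77 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | MEow | Average Return | 6,923.22 | | Unverified |
| 2 | SAC | Average Return | 6,211.5 | | Unverified |
| 3 | PPO | Average Return | 925.89 | | Unverified |
| 4 | TD3 | Average Return | 198.44 | | Unverified |
| 5 | DDPG | Average Return | 139.14 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | SAC | Average Return | 5,745.27 | | Unverified |
| 2 | MEow | Average Return | 5,526.66 | | Unverified |
| 3 | DDPG | Average Return | 2,994.54 | | Unverified |
| 4 | PPO | Average Return | 2,739.81 | | Unverified |
| 5 | TD3 | Average Return | 2,612.74 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | TLA | Mean Reward | 5,163.54 | | Unverified |
| 2 | AWR | Mean Reward | 5,067 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Orthogonal decision tree | Average Return | 500 | | Unverified |
| 2 | Oblique decision tree | Average Return | 500 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | TLA | Mean Reward | 9,571.99 | | Unverified |
| 2 | AWR | Mean Reward | 9,136 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | TLA | Mean Reward | 3,458.22 | | Unverified |
| 2 | AWR | Mean Reward | 3,405 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Oblique decision tree | Average Return | 272.14 | | Unverified |
| 2 | AWR | Average Return | 229 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Orthogonal decision tree | Average Return | -101.72 | | Unverified |
| 2 | Oblique decision tree | Average Return | -106.02 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | TLA with Hierarchical Reward Functions | Mean Reward | -125.02 | | Unverified |
| 2 | TLA | Mean Reward | -154.92 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | AWR | Mean Reward | 5,813 | | Unverified |
| 2 | TLA | Mean Reward | 3,878.41 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | AWR | Average Return | 4,996 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | TLA | Mean Reward | 9,356.67 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | TLA | Mean Reward | 1,000 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | TLA | Mean Reward | 93.88 | | Unverified |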