SOTAVerified

OpenAI Gym

An open-source toolkit from OpenAI that implements several reinforcement learning benchmarks, including classic control, Atari, robotics, and MuJoCo tasks.

(Description by Evolutionary learning of interpretable decision trees)

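The benchmark tables on this page report an average return, which is usually obtained by running a policy for several episodes through an environment's reset/step loop and averaging the summed rewards. The sketch below illustrates that loop under stated assumptions: `CoinFlipEnv`, `run_episode`, and `average_return` are hypothetical names, not part of Gym, and the toy environment only mirrors the classic Gym interface, where `step` returns `(observation, reward, done, info)`. Newer Gym and Gymnasium releases return a five-element tuple from `step` and an `(observation, info)` pair from `reset`, so the loop would need adjusting there.

```python
import random
from typing import Callable, Tuple


class CoinFlipEnv:
    """Hypothetical toy environment that mimics the classic Gym interface.

    It is not a Gym environment; it only mirrors the reset()/step() shape
    so the evaluation loop below runs without external dependencies.
    """

    def __init__(self, horizon: int = 10, seed: int = 0) -> None:
        self.horizon = horizon          # number of steps per episode
        self.rng = random.Random(seed)  # private RNG for reproducibility
        self.t = 0
        self.coin = 0

    def reset(self) -> int:
        """Start a new episode and return the first observation."""
        self.t = 0
        self.coin = self.rng.randint(0, 1)
        return self.coin

    def step(self, action: int) -> Tuple[int, float, bool, dict]:
        """Reward 1.0 when the action matches the currently shown coin."""
        reward = 1.0 if action == self.coin else 0.0
        self.t += 1
        self.coin = self.rng.randint(0, 1)  # next observation
        done = self.t >= self.horizon
        return self.coin, reward, done, {}


def run_episode(env, policy: Callable[[int], int]) -> float:
    """Run one episode and return the undiscounted sum of rewards."""
    obs = env.reset()
    done = False
    total = 0.0
    while not done:
        obs, reward, done, _info = env.step(policy(obs))
        total += reward
    return total


def average_return(env, policy: Callable[[int], int], episodes: int = 5) -> float:
    """Average the episode return over a fixed number of episodes."""
    returns = [run_episode(env, policy) for _ in range(episodes)]
    return sum(returns) / len(returns)


if __name__ == "__main__":
    env = CoinFlipEnv(horizon=10)
    # A policy that copies the observation is rewarded on every step.
    print(average_return(env, policy=lambda obs: obs))
```

Published numbers also depend on choices this sketch leaves out, such as the number of evaluation episodes, random seeds, and whether returns are discounted, so results from different papers are not always directly comparable.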

Papers

Showing 351–382 of 382 papers

| Title | Status | Hype |
| --- | --- | --- |
| Switching Isotropic and Directional Exploration with Parameter Space Noise in Deep Reinforcement Learning | | 0 |
| Visual Transfer between Atari Games using Competitive Reinforcement Learning | Code | 0 |
| GeneSys: Enabling Continuous Learning through Neural Network Evolution in Hardware | | 0 |
| FuzzerGym: A Competitive Framework for Fuzzing and Learning | | 0 |
| Online Robust Policy Learning in the Presence of Unknown Adversaries | | 0 |
| Qualitative Measurements of Policy Discrepancy for Return-Based Deep Q-Network | | 0 |
| Continuous-time Value Function Approximation in Reproducing Kernel Hilbert Spaces | | 0 |
| Deep Reinforcement Learning for General Video Game AI | Code | 0 |
| BindsNET: A machine learning-oriented spiking neural networks library in Python | Code | 0 |
| Intelligent Trainer for Model-Based Reinforcement Learning | Code | 0 |
| Advances in Experience Replay | Code | 0 |
| GAN Q-learning | Code | 0 |
| Deep Reinforcement Learning for Playing 2.5D Fighting Games | Code | 0 |
| State Distribution-aware Sampling for Deep Q-learning | | 0 |
| Structured Evolution with Compact Architectures for Scalable Policy Optimization | | 0 |
| Recurrent Predictive State Policy Networks | Code | 0 |
| Multi-Goal Reinforcement Learning: Challenging Robotics Environments and Request for Research | Code | 0 |
| Exploring Deep Recurrent Models with Reinforcement Learning for Molecule Design | | 0 |
| Neuron as an Agent | | 0 |
| Combining Model-based and Model-free RL via Multi-step Control Variates | | 0 |
| HoME: a Household Multimodal Environment | | 0 |
| A novel DDPG method with prioritized experience replay | Code | 0 |
| MDP environments for the OpenAI Gym | Code | 0 |
| Closing the loop between neural network simulators and the OpenAI Gym | | 0 |
| Benchmark Environments for Multitask Learning in Continuous Domains | Code | 0 |
| Investigating Reinforcement Learning Agents for Continuous State Space Environments | | 0 |
| Interpolated Policy Gradient: Merging On-Policy and Off-Policy Gradient Estimation for Deep Reinforcement Learning | | 0 |
| Non-Markovian Control with Gated End-to-End Memory Policy Networks | | 0 |
| AIXIjs: A Software Demo for General Reinforcement Learning | Code | 0 |
| Beating Atari with Natural Language Guided Reinforcement Learning | Code | 0 |
| Towards Generalization and Simplicity in Continuous Control | Code | 0 |
| Collaborative Deep Reinforcement Learning | Code | 0 |

Benchmark Results

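| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | MEow | Average Return | 6,586.33 | | Unverified |
| 2 | TD3 | Average Return | 5,942.55 | | Unverified |
| 3 | SAC | Average Return | 5,208.09 | | Unverified |
| 4 | DDPG | Average Return | 1,712.12 | | Unverified |
| 5 | PPO | Average Return | 608.97 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | SAC | Average Return | 15,836.04 | | Unverified |
| 2 | DDPG | Average Return | 14,934.86 | | Unverified |
| 3 | TD3 | Average Return | 12,026.73 | | Unverified |
| 4 | MEow | Average Return | 10,981.47 | | Unverified |
| 5 | PPO | Average Return | 6,006.11 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | MEow | Average Return | 3,332.99 | | Unverified |
| 2 | TD3 | Average Return | 3,319.98 | | Unverified |
| 3 | SAC | Average Return | 2,882.56 | | Unverified |
| 4 | DDPG | Average Return | 1,290.24 | | Unverified |
| 5 | PPO | Average Return | 790.77 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | MEow | Average Return | 6,923.22 | | Unverified |
| 2 | SAC | Average Return | 6,211.5 | | Unverified |
| 3 | PPO | Average Return | 925.89 | | Unverified |
| 4 | TD3 | Average Return | 198.44 | | Unverified |
| 5 | DDPG | Average Return | 139.14 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | SAC | Average Return | 5,745.27 | | Unverified |
| 2 | MEow | Average Return | 5,526.66 | | Unverified |
| 3 | DDPG | Average Return | 2,994.54 | | Unverified |
| 4 | PPO | Average Return | 2,739.81 | | Unverified |
| 5 | TD3 | Average Return | 2,612.74 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | TLA | Mean Reward | 5,163.54 | | Unverified |
| 2 | AWR | Mean Reward | 5,067 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | Orthogonal decision tree | Average Return | 500 | | Unverified |
| 2 | Oblique decision tree | Average Return | 500 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | TLA | Mean Reward | 9,571.99 | | Unverified |
| 2 | AWR | Mean Reward | 9,136 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | TLA | Mean Reward | 3,458.22 | | Unverified |
| 2 | AWR | Mean Reward | 3,405 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | Oblique decision tree | Average Return | 272.14 | | Unverified |
| 2 | AWR | Average Return | 229 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | Orthogonal decision tree | Average Return | -101.72 | | Unverified |
| 2 | Oblique decision tree | Average Return | -106.02 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | TLA with Hierarchical Reward Functions | Mean Reward | -125.02 | | Unverified |
| 2 | TLA | Mean Reward | -154.92 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | AWR | Mean Reward | 5,813 | | Unverified |
| 2 | TLA | Mean Reward | 3,878.41 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | AWR | Average Return | 4,996 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | TLA | Mean Reward | 9,356.67 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | TLA | Mean Reward | 1,000 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | TLA | Mean Reward | 93.88 | | Unverified |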