SOTAVerified

OpenAI Gym

An open-source toolkit from OpenAI that implements several reinforcement learning benchmarks, including classic control, Atari, robotics, and MuJoCo tasks.

(Description from "Evolutionary learning of interpretable decision trees")

Papers

Showing 301-325 of 382 papers

Title | Status | Hype
Illuminating Spaces: Deep Reinforcement Learning and Laser-Wall Partitioning for Architectural Layout Generation | | 0
Imaginary Hindsight Experience Replay: Curious Model-based Learning for Sparse Reward Tasks | | 0
Implementing Reinforcement Learning Algorithms in Retail Supply Chains with OpenAI Gym Toolkit | | 0
Implicit Sensing in Traffic Optimization: Advanced Deep Reinforcement Learning Techniques | | 0
Implicit Two-Tower Policies | | 0
Improving Reinforcement Learning with Human Assistance: An Argument for Human Subject Studies with HIPPO Gym | | 0
Influence-Based Reinforcement Learning for Intrinsically-Motivated Agents | | 0
In Support of Over-Parametrization in Deep Reinforcement Learning: an Empirical Study | | 0
Interpolated Policy Gradient: Merging On-Policy and Off-Policy Gradient Estimation for Deep Reinforcement Learning | | 0
Investigating Reinforcement Learning Agents for Continuous State Space Environments | | 0
LagNetViP: A Lagrangian Neural Network for Video Prediction | | 0
Multitask Neuroevolution for Reinforcement Learning with Long and Short Episodes | | 0
Learn a Prior for RHEA for Better Online Planning | | 0
Learning Environment Models with Continuous Stochastic Dynamics | | 0
Learning from Demonstrations using Signal Temporal Logic | | 0
Learning Gaussian Policies from Corrective Human Feedback | | 0
Local Environment Poisoning Attacks on Federated Reinforcement Learning | | 0
Long N-step Surrogate Stage Reward to Reduce Variances of Deep Reinforcement Learning in Complex Problems | | 0
Optimizing with Low Budgets: a Comparison on the Black-box Optimization Benchmarking Suite and OpenAI Gym | | 0
Low-cost Real-world Implementation of the Swing-up Pendulum for Deep Reinforcement Learning Experiments | | 0
Machine Learning aided Crop Yield Optimization | | 0
MADRaS : Multi Agent Driving Simulator | | 0
MAGICS: Adversarial RL with Minimax Actors Guided by Implicit Critic Stackelberg for Convergent Neural Synthesis of Robot Safety | | 0
MARTI-4: new model of human brain, considering neocortex and basal ganglia -- learns to play Atari game by reinforcement learning on a single CPU | | 0
MDP Playground: Controlling Orthogonal Dimensions of Hardness in Toy Environments | | 0
Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | MEow | Average Return | 6,586.33 | | Unverified
2 | TD3 | Average Return | 5,942.55 | | Unverified
3 | SAC | Average Return | 5,208.09 | | Unverified
4 | DDPG | Average Return | 1,712.12 | | Unverified
5 | PPO | Average Return | 608.97 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SAC | Average Return | 15,836.04 | | Unverified
2 | DDPG | Average Return | 14,934.86 | | Unverified
3 | TD3 | Average Return | 12,026.73 | | Unverified
4 | MEow | Average Return | 10,981.47 | | Unverified
5 | PPO | Average Return | 6,006.11 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | MEow | Average Return | 3,332.99 | | Unverified
2 | TD3 | Average Return | 3,319.98 | | Unverified
3 | SAC | Average Return | 2,882.56 | | Unverified
4 | DDPG | Average Return | 1,290.24 | | Unverified
5 | PPO | Average Return | 790.77 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | MEow | Average Return | 6,923.22 | | Unverified
2 | SAC | Average Return | 6,211.5 | | Unverified
3 | PPO | Average Return | 925.89 | | Unverified
4 | TD3 | Average Return | 198.44 | | Unverified
5 | DDPG | Average Return | 139.14 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SAC | Average Return | 5,745.27 | | Unverified
2 | MEow | Average Return | 5,526.66 | | Unverified
3 | DDPG | Average Return | 2,994.54 | | Unverified
4 | PPO | Average Return | 2,739.81 | | Unverified
5 | TD3 | Average Return | 2,612.74 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TLA | Mean Reward | 5,163.54 | | Unverified
2 | AWR | Mean Reward | 5,067 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Orthogonal decision tree | Average Return | 500 | | Unverified
2 | Oblique decision tree | Average Return | 500 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TLA | Mean Reward | 9,571.99 | | Unverified
2 | AWR | Mean Reward | 9,136 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TLA | Mean Reward | 3,458.22 | | Unverified
2 | AWR | Mean Reward | 3,405 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Oblique decision tree | Average Return | 272.14 | | Unverified
2 | AWR | Average Return | 229 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Orthogonal decision tree | Average Return | -101.72 | | Unverified
2 | Oblique decision tree | Average Return | -106.02 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TLA with Hierarchical Reward Functions | Mean Reward | -125.02 | | Unverified
2 | TLA | Mean Reward | -154.92 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | AWR | Mean Reward | 5,813 | | Unverified
2 | TLA | Mean Reward | 3,878.41 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | AWR | Average Return | 4,996 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TLA | Mean Reward | 9,356.67 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TLA | Mean Reward | 1,000 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TLA | Mean Reward | 93.88 | | Unverified