SOTAVerified

OpenAI Gym

An open-source toolkit from OpenAI that implements several reinforcement learning benchmarks, including classic control, Atari, robotics, and MuJoCo tasks.

(Description from the paper "Evolutionary learning of interpretable decision trees")

(Image Credit: OpenAI Gym)
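All Gym environments expose the same `reset`/`step` interaction loop, which is what lets the benchmarks above share one evaluation harness. A minimal self-contained sketch of that loop, using a hypothetical toy environment that mimics the classic Gym API (no `gym` installation assumed; `ToyEnv` and its constant +1 reward are illustrative, not part of any real benchmark):

```python
import random

class ToyEnv:
    """Hypothetical toy environment following the classic Gym API.

    Like CartPole, it pays +1 reward per step; the episode ends
    after max_steps transitions.
    """

    def __init__(self, max_steps=10):
        self.max_steps = max_steps
        self.t = 0

    def reset(self):
        """Start a new episode and return the initial observation."""
        self.t = 0
        return 0.0

    def step(self, action):
        """Advance one timestep; return (obs, reward, done, info)."""
        self.t += 1
        obs = float(self.t)
        reward = 1.0
        done = self.t >= self.max_steps
        return obs, reward, done, {}

# The standard Gym interaction loop: reset, then step until done.
env = ToyEnv()
obs = env.reset()
done = False
total_reward = 0.0
while not done:
    action = random.choice([0, 1])  # a random policy, for illustration
    obs, reward, done, info = env.step(action)
    total_reward += reward
print(total_reward)  # prints 10.0
```

The "Average Return" metric reported in the benchmark tables below is exactly this episode return (`total_reward`), averaged over many evaluation episodes.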

Papers

Showing 201–250 of 382 papers

Title | Status | Hype
VisualEnv: visual Gym environments with Blender | - | 0
DriverGym: Democratising Reinforcement Learning for Autonomous Driving | - | 0
AWD3: Dynamic Reduction of the Estimation Bias | - | 0
Proximal Policy Optimization with Continuous Bounded Action Space via the Beta Distribution | - | 0
Machine Learning aided Crop Yield Optimization | - | 0
DESTA: A Framework for Safe Reinforcement Learning with Markov Games of Intervention | - | 0
REIN-2: Giving Birth to Prepared Reinforcement Learning Agents Using Reinforcement Learning Agents | - | 0
Compositional Q-learning for electrolyte repletion with imbalanced patient sub-populations | - | 0
Imaginary Hindsight Experience Replay: Curious Model-based Learning for Sparse Reward Tasks | - | 0
Benchmarking Algorithms from Machine Learning for Low-Budget Black-Box Optimization | - | 0
Hypothesis Driven Coordinate Ascent for Reinforcement Learning | - | 0
Nested Policy Reinforcement Learning for Clinical Decision Support | - | 0
Untangling Braids with Multi-agent Q-Learning | - | 0
CrowdPlay: Crowdsourcing human demonstration data for offline learning in Atari games | - | 0
Experience Replay More When It's a Key Transition in Deep Reinforcement Learning | - | 0
Estimation Error Correction in Deep Reinforcement Learning for Deterministic Actor-Critic Methods | Code | 0
Membership Inference Attacks Against Temporally Correlated Data in Deep Reinforcement Learning | - | 0
An Oracle and Observations for the OpenAI Gym / ALE Freeway Environment | - | 0
Catastrophic Interference in Reinforcement Learning: A Solution Based on Context Division and Knowledge Distillation | Code | 0
Photonic Quantum Policy Learning in OpenAI Gym | - | 0
Influence-Based Reinforcement Learning for Intrinsically-Motivated Agents | - | 0
An Independent Study of Reinforcement Learning and Autonomous Driving | - | 0
An Analysis of Reinforcement Learning for Malaria Control | - | 0
Constrained Policy Gradient Method for Safe and Fast Reinforcement Learning: a Neural Tangent Kernel Based Approach | Code | 0
Population-coding and Dynamic-neurons improved Spiking Actor Network for Reinforcement Learning | - | 0
Offline Inverse Reinforcement Learning | - | 0
Exploration and preference satisfaction trade-off in reward-free learning | - | 0
AppBuddy: Learning to Accomplish Tasks in Mobile Apps via Reinforcement Learning | - | 0
Affine Transport for Sim-to-Real Domain Adaptation | - | 0
A Generalised Inverse Reinforcement Learning Framework | - | 0
Controlling an Inverted Pendulum with Policy Gradient Methods-A Tutorial | - | 0
RAIL: A modular framework for Reinforcement-learning-based Adversarial Imitation Learning | - | 0
Utilizing Skipped Frames in Action Repeats via Pseudo-Actions | - | 0
Implementing Reinforcement Learning Algorithms in Retail Supply Chains with OpenAI Gym Toolkit | - | 0
Reinforcement Learning using Guided Observability | - | 0
Policy Information Capacity: Information-Theoretic Measure for Task Complexity in Deep Reinforcement Learning | Code | 0
Optimism is All You Need: Model-Based Imitation Learning From Observation Alone | - | 0
Foresee then Evaluate: Decomposing Value Estimation with Latent Future Prediction | Code | 0
Modular Deep Reinforcement Learning for Continuous Motion Planning with Temporal Logic | Code | 0
MobILE: Model-Based Imitation Learning From Observation Alone | Code | 0
Sim-Env: Decoupling OpenAI Gym Environments from Simulation Models | Code | 0
Transferring Domain Knowledge with an Adviser in Continuous Tasks | - | 0
Learning from Demonstrations using Signal Temporal Logic | - | 0
Neurogenetic Programming Framework for Explainable Reinforcement Learning | Code | 0
Improving Reinforcement Learning with Human Assistance: An Argument for Human Subject Studies with HIPPO Gym | - | 0
BF++: a language for general-purpose program synthesis | Code | 0
Faults in Deep Reinforcement Learning Programs: A Taxonomy and A Detection Approach | Code | 0
Deep Q Learning from Dynamic Demonstration with Behavioral Cloning | - | 0
Error Controlled Actor-Critic Method to Reinforcement Learning | - | 0
Towards Understanding Asynchronous Advantage Actor-critic: Convergence and Linear Speedup | - | 0
Page 5 of 8

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | MEow | Average Return | 6,586.33 | - | Unverified
2 | TD3 | Average Return | 5,942.55 | - | Unverified
3 | SAC | Average Return | 5,208.09 | - | Unverified
4 | DDPG | Average Return | 1,712.12 | - | Unverified
5 | PPO | Average Return | 608.97 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SAC | Average Return | 15,836.04 | - | Unverified
2 | DDPG | Average Return | 14,934.86 | - | Unverified
3 | TD3 | Average Return | 12,026.73 | - | Unverified
4 | MEow | Average Return | 10,981.47 | - | Unverified
5 | PPO | Average Return | 6,006.11 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | MEow | Average Return | 3,332.99 | - | Unverified
2 | TD3 | Average Return | 3,319.98 | - | Unverified
3 | SAC | Average Return | 2,882.56 | - | Unverified
4 | DDPG | Average Return | 1,290.24 | - | Unverified
5 | PPO | Average Return | 790.77 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | MEow | Average Return | 6,923.22 | - | Unverified
2 | SAC | Average Return | 6,211.5 | - | Unverified
3 | PPO | Average Return | 925.89 | - | Unverified
4 | TD3 | Average Return | 198.44 | - | Unverified
5 | DDPG | Average Return | 139.14 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SAC | Average Return | 5,745.27 | - | Unverified
2 | MEow | Average Return | 5,526.66 | - | Unverified
3 | DDPG | Average Return | 2,994.54 | - | Unverified
4 | PPO | Average Return | 2,739.81 | - | Unverified
5 | TD3 | Average Return | 2,612.74 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TLA | Mean Reward | 5,163.54 | - | Unverified
2 | AWR | Mean Reward | 5,067 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Orthogonal decision tree | Average Return | 500 | - | Unverified
2 | Oblique decision tree | Average Return | 500 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TLA | Mean Reward | 9,571.99 | - | Unverified
2 | AWR | Mean Reward | 9,136 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TLA | Mean Reward | 3,458.22 | - | Unverified
2 | AWR | Mean Reward | 3,405 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Oblique decision tree | Average Return | 272.14 | - | Unverified
2 | AWR | Average Return | 229 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Orthogonal decision tree | Average Return | -101.72 | - | Unverified
2 | Oblique decision tree | Average Return | -106.02 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TLA with Hierarchical Reward Functions | Mean Reward | -125.02 | - | Unverified
2 | TLA | Mean Reward | -154.92 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | AWR | Mean Reward | 5,813 | - | Unverified
2 | TLA | Mean Reward | 3,878.41 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | AWR | Average Return | 4,996 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TLA | Mean Reward | 9,356.67 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TLA | Mean Reward | 1,000 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TLA | Mean Reward | 93.88 | - | Unverified