SOTAVerified

OpenAI Gym

An open-source toolkit from OpenAI that implements several reinforcement learning benchmarks, including classic control, Atari, robotics, and MuJoCo tasks.

(Description by Evolutionary learning of interpretable decision trees)
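
As a rough illustration of what such a benchmark interface looks like, the sketch below mimics the classic Gym interaction loop (`reset()` returning an observation, `step(action)` returning observation, reward, done flag, and info) and computes an average return of the kind reported in the benchmark results on this page. `CoinFlipEnv`, `run_episode`, and `average_return` are hypothetical names made up for this example and are not part of Gym. Newer Gym and Gymnasium releases return a different tuple from `reset()` and `step()`, so treat this as a sketch under those assumptions rather than the library's API.

```python
import random
from statistics import mean


class CoinFlipEnv:
    """Hypothetical toy environment with a classic Gym-style interface."""

    def __init__(self, horizon=10, seed=0):
        self.horizon = horizon          # number of steps per episode
        self.rng = random.Random(seed)  # private RNG for reproducibility
        self.t = 0

    def reset(self):
        # Start a new episode and return the initial observation.
        self.t = 0
        return self.t

    def step(self, action):
        # Reward 1.0 when the action matches a hidden coin flip, else 0.0.
        coin = self.rng.randint(0, 1)
        reward = 1.0 if action == coin else 0.0
        self.t += 1
        done = self.t >= self.horizon
        return self.t, reward, done, {}


def run_episode(env, policy):
    """Roll out one episode and return the undiscounted sum of rewards."""
    obs = env.reset()
    total, done = 0.0, False
    while not done:
        obs, reward, done, _info = env.step(policy(obs))
        total += reward
    return total


def average_return(env, policy, episodes=20):
    """Mean episode return over several rollouts."""
    return mean(run_episode(env, policy) for _ in range(episodes))


if __name__ == "__main__":
    env = CoinFlipEnv()
    # A constant policy that always picks action 0.
    print(average_return(env, policy=lambda obs: 0))
```

With a real Gym environment, the same loop would typically use an environment created by the library in place of `CoinFlipEnv`, and a trained agent in place of the constant policy.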

Papers

Showing 201-250 of 382 papers

Title | Status | Hype
Foresee then Evaluate: Decomposing Value Estimation with Latent Future Prediction | Code | 0
Modular Deep Reinforcement Learning for Continuous Motion Planning with Temporal Logic | Code | 0
MobILE: Model-Based Imitation Learning From Observation Alone | Code | 0
Sim-Env: Decoupling OpenAI Gym Environments from Simulation Models | Code | 0
Deluca -- A Differentiable Control Library: Environments, Methods, and Benchmarking | Code | 1
Transferring Domain Knowledge with an Adviser in Continuous Tasks | - | 0
Learning from Demonstrations using Signal Temporal Logic | - | 0
Improving Model-Based Reinforcement Learning with Internal State Representations through Self-Supervision | Code | 1
Neurogenetic Programming Framework for Explainable Reinforcement Learning | Code | 0
LongiControl: A Reinforcement Learning Environment for Longitudinal Vehicle Control | Code | 1
Explainable Reinforcement Learning for Longitudinal Control | Code | 1
Improving Reinforcement Learning with Human Assistance: An Argument for Human Subject Studies with HIPPO Gym | - | 0
BF++: a language for general-purpose program synthesis | Code | 0
Developing an OpenAI Gym-compatible framework and simulation environment for testing Deep Reinforcement Learning agents solving the Ambulance Location Problem | Code | 1
Faults in Deep Reinforcement Learning Programs: A Taxonomy and A Detection Approach | Code | 0
Error Controlled Actor-Critic Method to Reinforcement Learning | - | 0
Deep Q Learning from Dynamic Demonstration with Behavioral Cloning | - | 0
Towards Understanding Asynchronous Advantage Actor-critic: Convergence and Linear Speedup | - | 0
Reinforcement Learning for Control of Valves | Code | 1
myGym: Modular Toolkit for Visuomotor Robotic Tasks | - | 0
CityLearn: Standardizing Research in Multi-Agent Reinforcement Learning for Demand Response and Urban Energy Management | Code | 1
Evading Web Application Firewalls with Reinforcement Learning | - | 0
Evolutionary learning of interpretable decision trees | Code | 0
Resolving Implicit Coordination in Multi-Agent Deep Reinforcement Learning with Deep Q-Networks & Game Theory | Code | 0
NavRep: Unsupervised Representations for Reinforcement Learning of Robot Navigation in Dynamic Human Environments | Code | 1
ACN-Sim: An Open-Source Simulator for Data-Driven Electric Vehicle Charging Research | Code | 1
Revisiting Maximum Entropy Inverse Reinforcement Learning: New Perspectives and Algorithms | Code | 1
NLPGym -- A toolkit for evaluating RL agents on Natural Language Processing Tasks | Code | 1
Tonic: A Deep Reinforcement Learning Library for Fast Prototyping and Benchmarking | Code | 1
SoftGym: Benchmarking Deep Reinforcement Learning for Deformable Object Manipulation | Code | 1
Ecole: A Gym-like Library for Machine Learning in Combinatorial Optimization Solvers | Code | 1
Amortized Variational Deep Q Network | Code | 0
Control with adaptive Q-learning | Code | 0
LagNetViP: A Lagrangian Neural Network for Video Prediction | - | 0
Proximal Policy Gradient: PPO with Policy Gradient | - | 0
Deep Reinforcement Learning with Population-Coded Spiking Neural Network for Continuous Control | Code | 1
What About Inputing Policy in Value Function: Policy Representation and Policy-extended Value Function Approximator | Code | 1
Deep Learning of Koopman Representation for Control | - | 0
A Learning Approach to Robot-Agnostic Force-Guided High Precision Assembly | - | 0
EpidemiOptim: A Toolbox for the Optimization of Control Policies in Epidemiological Models | Code | 1
MADRaS : Multi Agent Driving Simulator | - | 0
Deep Reinforcement Learning with Mixed Convolutional Network | - | 0
MDP Playground: Controlling Orthogonal Dimensions of Hardness in Toy Environments | - | 0
GRAC: Self-Guided and Self-Regularized Actor-Critic | Code | 0
Evolutionary Selective Imitation: Interpretable Agents by Imitation Learning Without a Demonstrator | - | 0
VacSIM: Learning Effective Strategies for COVID-19 Vaccine Distribution using Reinforcement Learning | Code | 0
Extended Radial Basis Function Controller for Reinforcement Learning | - | 0
Optimality-based Analysis of XCSF Compaction in Discrete Reinforcement Learning | Code | 0
On the model-based stochastic value gradient for continuous reinforcement learning | Code | 1
Reinforcement Learning with Quantum Variational Circuits | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | MEow | Average Return | 6,586.33 | - | Unverified
2 | TD3 | Average Return | 5,942.55 | - | Unverified
3 | SAC | Average Return | 5,208.09 | - | Unverified
4 | DDPG | Average Return | 1,712.12 | - | Unverified
5 | PPO | Average Return | 608.97 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SAC | Average Return | 15,836.04 | - | Unverified
2 | DDPG | Average Return | 14,934.86 | - | Unverified
3 | TD3 | Average Return | 12,026.73 | - | Unverified
4 | MEow | Average Return | 10,981.47 | - | Unverified
5 | PPO | Average Return | 6,006.11 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | MEow | Average Return | 3,332.99 | - | Unverified
2 | TD3 | Average Return | 3,319.98 | - | Unverified
3 | SAC | Average Return | 2,882.56 | - | Unverified
4 | DDPG | Average Return | 1,290.24 | - | Unverified
5 | PPO | Average Return | 790.77 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | MEow | Average Return | 6,923.22 | - | Unverified
2 | SAC | Average Return | 6,211.5 | - | Unverified
3 | PPO | Average Return | 925.89 | - | Unverified
4 | TD3 | Average Return | 198.44 | - | Unverified
5 | DDPG | Average Return | 139.14 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SAC | Average Return | 5,745.27 | - | Unverified
2 | MEow | Average Return | 5,526.66 | - | Unverified
3 | DDPG | Average Return | 2,994.54 | - | Unverified
4 | PPO | Average Return | 2,739.81 | - | Unverified
5 | TD3 | Average Return | 2,612.74 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TLA | Mean Reward | 5,163.54 | - | Unverified
2 | AWR | Mean Reward | 5,067 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Orthogonal decision tree | Average Return | 500 | - | Unverified
2 | Oblique decision tree | Average Return | 500 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TLA | Mean Reward | 9,571.99 | - | Unverified
2 | AWR | Mean Reward | 9,136 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TLA | Mean Reward | 3,458.22 | - | Unverified
2 | AWR | Mean Reward | 3,405 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Oblique decision tree | Average Return | 272.14 | - | Unverified
2 | AWR | Average Return | 229 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Orthogonal decision tree | Average Return | -101.72 | - | Unverified
2 | Oblique decision tree | Average Return | -106.02 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TLA with Hierarchical Reward Functions | Mean Reward | -125.02 | - | Unverified
2 | TLA | Mean Reward | -154.92 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | AWR | Mean Reward | 5,813 | - | Unverified
2 | TLA | Mean Reward | 3,878.41 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | AWR | Average Return | 4,996 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TLA | Mean Reward | 9,356.67 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TLA | Mean Reward | 1,000 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TLA | Mean Reward | 93.88 | - | Unverified