SOTAVerified

OpenAI Gym

An open-source toolkit from OpenAI that implements several reinforcement learning benchmarks, including classic control, Atari, robotics, and MuJoCo tasks.

(Description by Evolutionary learning of interpretable decision trees)

Papers

Showing 51-100 of 382 papers

Title | Status | Hype
MoËT: Mixture of Expert Trees and its Application to Verifiable Reinforcement Learning | Code | 1
Integrating Deep Reinforcement Learning Networks with Health System Simulations | Code | 1
CropGym: a Reinforcement Learning Environment for Crop Management | Code | 1
Experience Replay with Likelihood-free Importance Weights | Code | 1
A Benchmark Environment Motivated by Industrial Control Problems | Code | 1
CaiRL: A High-Performance Reinforcement Learning Environment Toolkit | Code | 1
Towards Real-World Deployment of Reinforcement Learning for Traffic Signal Control | Code | 1
MarsExplorer: Exploration of Unknown Terrains via Deep Reinforcement Learning and Procedurally Generated Environments | Code | 1
LongiControl: A Reinforcement Learning Environment for Longitudinal Vehicle Control | Code | 1
Blue River Controls: A toolkit for Reinforcement Learning Control Systems on Hardware | Code | 1
Bayesian Soft Actor-Critic: A Directed Acyclic Strategy Graph Based Deep Reinforcement Learning | Code | 1
Controlgym: Large-Scale Control Environments for Benchmarking Reinforcement Learning Algorithms | Code | 1
ABIDES-Gym: Gym Environments for Multi-Agent Discrete Event Simulation and Application to Financial Markets | Code | 1
Offline Retraining for Online RL: Decoupled Policy Learning to Mitigate Exploration Bias | Code | 1
Can language agents be alternatives to PPO? A Preliminary Empirical Study On OpenAI Gym | Code | 1
Maximum Entropy Reinforcement Learning via Energy-Based Normalizing Flow | Code | 1
Deep Recurrent Q-Learning for Partially Observable MDPs | Code | 1
Monte Carlo Tree Search for Asymmetric Trees | Code | 1
CityLearn: Standardizing Research in Multi-Agent Reinforcement Learning for Demand Response and Urban Energy Management | Code | 1
NavRep: Unsupervised Representations for Reinforcement Learning of Robot Navigation in Dynamic Human Environments | Code | 1
PushWorld: A benchmark for manipulation planning with tools and movable obstacles | Code | 1
Deep Reinforcement Learning with Population-Coded Spiking Neural Network for Continuous Control | Code | 1
Deluca -- A Differentiable Control Library: Environments, Methods, and Benchmarking | Code | 1
OMPO: A Unified Framework for RL under Policy and Dynamics Shifts | Code | 1
CompilerGym: Robust, Performant Compiler Optimization Environments for AI Research | Code | 1
An Open-Source Multi-Goal Reinforcement Learning Environment for Robotic Manipulation with Pybullet | Code | 1
Avalanche RL: a Continual Reinforcement Learning Library | Code | 1
PDDLGym: Gym Environments from PDDL Problems | Code | 1
Bridging Dimensions: Confident Reachability for High-Dimensional Controllers | Code | 0
Amortized Variational Deep Q Network | Code | 0
AIXIjs: A Software Demo for General Reinforcement Learning | Code | 0
BindsNET: A machine learning-oriented spiking neural networks library in Python | Code | 0
BF++: a language for general-purpose program synthesis | Code | 0
Benchmark Environments for Multitask Learning in Continuous Domains | Code | 0
Invariant Transform Experience Replay: Data Augmentation for Deep Reinforcement Learning | Code | 0
Beating Atari with Natural Language Guided Reinforcement Learning | Code | 0
Investigating the Performance and Reliability, of the Q-Learning Algorithm in Various Unknown Environments | Code | 0
IN-RIL: Interleaved Reinforcement and Imitation Learning for Policy Fine-Tuning | Code | 0
Intelligent Trainer for Model-Based Reinforcement Learning | Code | 0
Iroko: A Framework to Prototype Reinforcement Learning for Data Center Traffic Control | Code | 0
HistoGym: A Reinforcement Learning Environment for Histopathological Image Analysis | Code | 0
Deep Active Localization | Code | 0
HDDLGym: A Tool for Studying Multi-Agent Hierarchical Problems Defined in HDDL with OpenAI Gym | Code | 0
Decision Making in Non-Stationary Environments with Policy-Augmented Search | Code | 0
gym-gazebo2, a toolkit for reinforcement learning using ROS 2 and Gazebo | Code | 0
Gym-Ignition: Reproducible Robotic Simulations for Reinforcement Learning | Code | 0
Decision Mamba Architectures | Code | 0
Deconfounding Reinforcement Learning in Observational Settings | Code | 0
Foresee then Evaluate: Decomposing Value Estimation with Latent Future Prediction | Code | 0
Flappy Hummingbird: An Open Source Dynamic Simulation of Flapping Wing Robots and Animals | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | MEow | Average Return | 6,586.33 | - | Unverified
2 | TD3 | Average Return | 5,942.55 | - | Unverified
3 | SAC | Average Return | 5,208.09 | - | Unverified
4 | DDPG | Average Return | 1,712.12 | - | Unverified
5 | PPO | Average Return | 608.97 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SAC | Average Return | 15,836.04 | - | Unverified
2 | DDPG | Average Return | 14,934.86 | - | Unverified
3 | TD3 | Average Return | 12,026.73 | - | Unverified
4 | MEow | Average Return | 10,981.47 | - | Unverified
5 | PPO | Average Return | 6,006.11 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | MEow | Average Return | 3,332.99 | - | Unverified
2 | TD3 | Average Return | 3,319.98 | - | Unverified
3 | SAC | Average Return | 2,882.56 | - | Unverified
4 | DDPG | Average Return | 1,290.24 | - | Unverified
5 | PPO | Average Return | 790.77 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | MEow | Average Return | 6,923.22 | - | Unverified
2 | SAC | Average Return | 6,211.5 | - | Unverified
3 | PPO | Average Return | 925.89 | - | Unverified
4 | TD3 | Average Return | 198.44 | - | Unverified
5 | DDPG | Average Return | 139.14 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SAC | Average Return | 5,745.27 | - | Unverified
2 | MEow | Average Return | 5,526.66 | - | Unverified
3 | DDPG | Average Return | 2,994.54 | - | Unverified
4 | PPO | Average Return | 2,739.81 | - | Unverified
5 | TD3 | Average Return | 2,612.74 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TLA | Mean Reward | 5,163.54 | - | Unverified
2 | AWR | Mean Reward | 5,067 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Orthogonal decision tree | Average Return | 500 | - | Unverified
2 | Oblique decision tree | Average Return | 500 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TLA | Mean Reward | 9,571.99 | - | Unverified
2 | AWR | Mean Reward | 9,136 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TLA | Mean Reward | 3,458.22 | - | Unverified
2 | AWR | Mean Reward | 3,405 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Oblique decision tree | Average Return | 272.14 | - | Unverified
2 | AWR | Average Return | 229 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Orthogonal decision tree | Average Return | -101.72 | - | Unverified
2 | Oblique decision tree | Average Return | -106.02 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TLA with Hierarchical Reward Functions | Mean Reward | -125.02 | - | Unverified
2 | TLA | Mean Reward | -154.92 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | AWR | Mean Reward | 5,813 | - | Unverified
2 | TLA | Mean Reward | 3,878.41 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | AWR | Average Return | 4,996 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TLA | Mean Reward | 9,356.67 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TLA | Mean Reward | 1,000 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TLA | Mean Reward | 93.88 | - | Unverified