SOTAVerified

OpenAI Gym

An open-source toolkit from OpenAI that implements several reinforcement learning benchmarks, including classic control, Atari, robotics, and MuJoCo tasks.

(Description from "Evolutionary learning of interpretable decision trees")

Papers

Showing 101-150 of 382 papers

Title | Status | Hype
Deconfounding Reinforcement Learning in Observational Settings | Code | 0
Deep Q-learning: a robust control approach | Code | 0
Decision Making in Non-Stationary Environments with Policy-Augmented Search | Code | 0
Neurogenetic Programming Framework for Explainable Reinforcement Learning | Code | 0
QFlip: An Adaptive Reinforcement Learning Strategy for the FlipIt Security Game | Code | 0
Efficient Parallel Reinforcement Learning Framework using the Reactor Model | Code | 0
Proximal Distilled Evolutionary Reinforcement Learning | Code | 0
Deep Reinforcement Learning for General Video Game AI | Code | 0
Deep Reinforcement Learning for Playing 2.5D Fighting Games | Code | 0
Deep Reinforcement Learning with Feedback-based Exploration | Code | 0
MORE-3S:Multimodal-based Offline Reinforcement Learning with Shared Semantic Spaces | Code | 0
Multi-Goal Reinforcement Learning: Challenging Robotics Environments and Request for Research | Code | 0
MVFST-RL: An Asynchronous RL Framework for Congestion Control with Delayed Actions | Code | 0
Creating Hierarchical Dispositions of Needs in an Agent | Code | 0
Arena: a toolkit for Multi-Agent Reinforcement Learning | Code | 0
Modular Deep Reinforcement Learning for Continuous Motion Planning with Temporal Logic | Code | 0
Control with adaptive Q-learning | Code | 0
MDP environments for the OpenAI Gym | Code | 0
A quantum-classical reinforcement learning model to play Atari games | Code | 0
Advances in Experience Replay | Code | 0
MDP Playground: An Analysis and Debug Testbed for Reinforcement Learning | Code | 0
Improving the Data-efficiency of Reinforcement Learning by Warm-starting with LLM | Code | 0
Continuous Control With Ensemble Deep Deterministic Policy Gradients | Code | 0
Discrete Action On-Policy Learning with Action-Value Critic | Code | 0
Bridging Dimensions: Confident Reachability for High-Dimensional Controllers | Code | 0
Reinforcement Learning with Quantum Variational Circuits | Code | 0
Continuous-action Reinforcement Learning for Playing Racing Games: Comparing SPG to PPO | Code | 0
A novel DDPG method with prioritized experience replay | Code | 0
Mining-Gym: A Configurable RL Benchmarking Environment for Truck Dispatch Scheduling | Code | 0
Towards Interactive Training of Non-Player Characters in Video Games | Code | 0
Constrained Policy Gradient Method for Safe and Fast Reinforcement Learning: a Neural Tangent Kernel Based Approach | Code | 0
Iroko: A Framework to Prototype Reinforcement Learning for Data Center Traffic Control | Code | 0
Invariant Transform Experience Replay: Data Augmentation for Deep Reinforcement Learning | Code | 0
Investigating the Performance and Reliability, of the Q-Learning Algorithm in Various Unknown Environments | Code | 0
IN-RIL: Interleaved Reinforcement and Imitation Learning for Policy Fine-Tuning | Code | 0
Collaborative Deep Reinforcement Learning | Code | 0
Evolutionary learning of interpretable decision trees | Code | 0
Intelligent Trainer for Model-Based Reinforcement Learning | Code | 0
Andes_gym: A Versatile Environment for Deep Reinforcement Learning in Power Systems | Code | 0
Estimation Error Correction in Deep Reinforcement Learning for Deterministic Actor-Critic Methods | Code | 0
HDDLGym: A Tool for Studying Multi-Agent Hierarchical Problems Defined in HDDL with OpenAI Gym | Code | 0
Gym-Ignition: Reproducible Robotic Simulations for Reinforcement Learning | Code | 0
Decision Mamba Architectures | Code | 0
GRAC: Self-Guided and Self-Regularized Actor-Critic | Code | 0
Generative Adversarial Neuroevolution for Control Behaviour Imitation | Code | 0
Guaranteeing Control Requirements via Reward Shaping in Reinforcement Learning | Code | 0
Adaptively Calibrated Critic Estimates for Deep Reinforcement Learning | Code | 0
GAN Q-learning | Code | 0
Comparing the Efficacy of Fine-Tuning and Meta-Learning for Few-Shot Policy Imitation | Code | 0
gym-gazebo2, a toolkit for reinforcement learning using ROS 2 and Gazebo | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | MEow | Average Return | 6,586.33 | | Unverified
2 | TD3 | Average Return | 5,942.55 | | Unverified
3 | SAC | Average Return | 5,208.09 | | Unverified
4 | DDPG | Average Return | 1,712.12 | | Unverified
5 | PPO | Average Return | 608.97 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SAC | Average Return | 15,836.04 | | Unverified
2 | DDPG | Average Return | 14,934.86 | | Unverified
3 | TD3 | Average Return | 12,026.73 | | Unverified
4 | MEow | Average Return | 10,981.47 | | Unverified
5 | PPO | Average Return | 6,006.11 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | MEow | Average Return | 3,332.99 | | Unverified
2 | TD3 | Average Return | 3,319.98 | | Unverified
3 | SAC | Average Return | 2,882.56 | | Unverified
4 | DDPG | Average Return | 1,290.24 | | Unverified
5 | PPO | Average Return | 790.77 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | MEow | Average Return | 6,923.22 | | Unverified
2 | SAC | Average Return | 6,211.5 | | Unverified
3 | PPO | Average Return | 925.89 | | Unverified
4 | TD3 | Average Return | 198.44 | | Unverified
5 | DDPG | Average Return | 139.14 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SAC | Average Return | 5,745.27 | | Unverified
2 | MEow | Average Return | 5,526.66 | | Unverified
3 | DDPG | Average Return | 2,994.54 | | Unverified
4 | PPO | Average Return | 2,739.81 | | Unverified
5 | TD3 | Average Return | 2,612.74 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TLA | Mean Reward | 5,163.54 | | Unverified
2 | AWR | Mean Reward | 5,067 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Orthogonal decision tree | Average Return | 500 | | Unverified
2 | Oblique decision tree | Average Return | 500 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TLA | Mean Reward | 9,571.99 | | Unverified
2 | AWR | Mean Reward | 9,136 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TLA | Mean Reward | 3,458.22 | | Unverified
2 | AWR | Mean Reward | 3,405 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Oblique decision tree | Average Return | 272.14 | | Unverified
2 | AWR | Average Return | 229 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Orthogonal decision tree | Average Return | -101.72 | | Unverified
2 | Oblique decision tree | Average Return | -106.02 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TLA with Hierarchical Reward Functions | Mean Reward | -125.02 | | Unverified
2 | TLA | Mean Reward | -154.92 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | AWR | Mean Reward | 5,813 | | Unverified
2 | TLA | Mean Reward | 3,878.41 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | AWR | Average Return | 4,996 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TLA | Mean Reward | 9,356.67 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TLA | Mean Reward | 1,000 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TLA | Mean Reward | 93.88 | | Unverified
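
The tables above report "Average Return" and "Mean Reward" but this page does not define either metric. The sketch below shows one common reading, offered as an assumption: the mean of the undiscounted sum of rewards per episode, taken over several evaluation episodes. `ToyCorridorEnv`, `episode_return`, `average_return`, and `always_right` are hypothetical names invented for illustration. The toy environment only imitates the classic Gym-style `reset()`/`step()` interface, in which `step` returns `(observation, reward, done, info)`; it is not part of Gym, and newer Gym and Gymnasium releases return a five-element tuple instead.

```python
from statistics import mean


class ToyCorridorEnv:
    """Hypothetical stand-in with a classic Gym-style reset/step interface."""

    def __init__(self, length=5, max_steps=20):
        self.length = length        # goal is reached at position >= length
        self.max_steps = max_steps  # episode is cut off after this many steps
        self.position = 0
        self.steps = 0

    def reset(self):
        self.position = 0
        self.steps = 0
        return self.position

    def step(self, action):
        # Action 1 moves right; any other action moves left (clipped at 0).
        self.position = max(0, self.position + (1 if action == 1 else -1))
        self.steps += 1
        reached_goal = self.position >= self.length
        reward = 10.0 if reached_goal else -1.0
        done = reached_goal or self.steps >= self.max_steps
        return self.position, reward, done, {}


def episode_return(env, policy):
    """Run one episode and sum its rewards without discounting."""
    obs = env.reset()
    total, done = 0.0, False
    while not done:
        obs, reward, done, _ = env.step(policy(obs))
        total += reward
    return total


def average_return(env, policy, episodes=10):
    """Mean episode return over a fixed number of evaluation episodes."""
    return mean(episode_return(env, policy) for _ in range(episodes))


def always_right(obs):
    """Trivial hypothetical policy: always move toward the goal."""
    return 1


if __name__ == "__main__":
    print(average_return(ToyCorridorEnv(), always_right, episodes=3))
```

Under these assumptions the `always_right` policy pays four step penalties of -1 and then collects the goal reward of 10, so every episode returns 6.0. Published numbers such as those in the tables may use a different number of evaluation episodes, seeds, or episode length limits, which is one reason claimed results can be hard to compare directly.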