SOTAVerified

OpenAI Gym

An open-source toolkit from OpenAI that implements several reinforcement learning benchmarks, including classic control, Atari, robotics, and MuJoCo tasks.

(Description from "Evolutionary learning of interpretable decision trees")

Papers

Showing 251-300 of 382 papers

Title | Status | Hype
Sufficient Exploration for Convex Q-learning |  | 0
SURREAL-System: Fully-Integrated Stack for Distributed Deep Reinforcement Learning |  | 0
Switching Isotropic and Directional Exploration with Parameter Space Noise in Deep Reinforcement Learning |  | 0
Taming an autonomous surface vehicle for path following and collision avoidance using deep reinforcement learning |  | 0
Teaching a Robot to Walk Using Reinforcement Learning |  | 0
Towards Brain-inspired System: Deep Recurrent Reinforcement Learning for Simulated Self-driving Agent |  | 0
Towards Characterizing Divergence in Deep Q-Learning |  | 0
Towards Combining On-Off-Policy Methods for Real-World Applications |  | 0
Towards Physically Safe Reinforcement Learning under Supervision |  | 0
Traffic control using intelligent timing of traffic lights with reinforcement learning technique and real-time processing of surveillance camera images |  | 0
Transferring Domain Knowledge with an Adviser in Continuous Tasks |  | 0
Untangling Braids with Multi-agent Q-Learning |  | 0
Utilizing Skipped Frames in Action Repeats via Pseudo-Actions |  | 0
Value-Based Deep RL Scales Predictably |  | 0
VisualEnv: visual Gym environments with Blender |  | 0
Way Off-Policy Batch Deep Reinforcement Learning of Human Preferences in Dialog |  | 0
Membership Inference Attacks Against Temporally Correlated Data in Deep Reinforcement Learning |  | 0
Zap Q-Learning With Nonlinear Function Approximation |  | 0
Gym-preCICE: Reinforcement Learning Environments for Active Flow Control |  | 0
Gym-saturation: an OpenAI Gym environment for saturation provers |  | 0
gym-saturation: Gymnasium environments for saturation provers (System description) |  | 0
HoME: a Household Multimodal Environment |  | 0
HomeLabGym: A real-world testbed for home energy management systems |  | 0
Human AI interaction loop training: New approach for interactive reinforcement learning |  | 0
Hybrid Policies Using Inverse Rewards for Reinforcement Learning |  | 0
Hypothesis Driven Coordinate Ascent for Reinforcement Learning |  | 0
Illuminating Spaces: Deep Reinforcement Learning and Laser-Wall Partitioning for Architectural Layout Generation |  | 0
Imaginary Hindsight Experience Replay: Curious Model-based Learning for Sparse Reward Tasks |  | 0
Implementing Reinforcement Learning Algorithms in Retail Supply Chains with OpenAI Gym Toolkit |  | 0
Implicit Sensing in Traffic Optimization: Advanced Deep Reinforcement Learning Techniques |  | 0
gym-gazebo2, a toolkit for reinforcement learning using ROS 2 and Gazebo | Code | 0
Gym-Ignition: Reproducible Robotic Simulations for Reinforcement Learning | Code | 0
Continuous Control With Ensemble Deep Deterministic Policy Gradients | Code | 0
Continuous-action Reinforcement Learning for Playing Racing Games: Comparing SPG to PPO | Code | 0
Mimicking Better by Matching the Approximate Action Distribution | Code | 0
HDDLGym: A Tool for Studying Multi-Agent Hierarchical Problems Defined in HDDL with OpenAI Gym | Code | 0
Decision Mamba Architectures | Code | 0
HistoGym: A Reinforcement Learning Environment for Histopathological Image Analysis | Code | 0
QFlip: An Adaptive Reinforcement Learning Strategy for the FlipIt Security Game | Code | 0
Playing Games in the Dark: An approach for cross-modality transfer in reinforcement learning | Code | 0
SwiftRL: Towards Efficient Reinforcement Learning on Real Processing-In-Memory Systems | Code | 0
Policy Information Capacity: Information-Theoretic Measure for Task Complexity in Deep Reinforcement Learning | Code | 0
Towards Generalization and Simplicity in Continuous Control | Code | 0
Constrained Policy Gradient Method for Safe and Fast Reinforcement Learning: a Neural Tangent Kernel Based Approach | Code | 0
Project proposal: A modular reinforcement learning based automated theorem prover | Code | 0
Neural-encoding Human Experts' Domain Knowledge to Warm Start Reinforcement Learning | Code | 0
Guaranteeing Control Requirements via Reward Shaping in Reinforcement Learning | Code | 0
SDGym: Low-Code Reinforcement Learning Environments using System Dynamics Models | Code | 0
Provably Efficient Imitation Learning from Observation Alone | Code | 0
GRAC: Self-Guided and Self-Regularized Actor-Critic | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | MEow | Average Return | 6,586.33 |  | Unverified
2 | TD3 | Average Return | 5,942.55 |  | Unverified
3 | SAC | Average Return | 5,208.09 |  | Unverified
4 | DDPG | Average Return | 1,712.12 |  | Unverified
5 | PPO | Average Return | 608.97 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SAC | Average Return | 15,836.04 |  | Unverified
2 | DDPG | Average Return | 14,934.86 |  | Unverified
3 | TD3 | Average Return | 12,026.73 |  | Unverified
4 | MEow | Average Return | 10,981.47 |  | Unverified
5 | PPO | Average Return | 6,006.11 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | MEow | Average Return | 3,332.99 |  | Unverified
2 | TD3 | Average Return | 3,319.98 |  | Unverified
3 | SAC | Average Return | 2,882.56 |  | Unverified
4 | DDPG | Average Return | 1,290.24 |  | Unverified
5 | PPO | Average Return | 790.77 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | MEow | Average Return | 6,923.22 |  | Unverified
2 | SAC | Average Return | 6,211.5 |  | Unverified
3 | PPO | Average Return | 925.89 |  | Unverified
4 | TD3 | Average Return | 198.44 |  | Unverified
5 | DDPG | Average Return | 139.14 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SAC | Average Return | 5,745.27 |  | Unverified
2 | MEow | Average Return | 5,526.66 |  | Unverified
3 | DDPG | Average Return | 2,994.54 |  | Unverified
4 | PPO | Average Return | 2,739.81 |  | Unverified
5 | TD3 | Average Return | 2,612.74 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TLA | Mean Reward | 5,163.54 |  | Unverified
2 | AWR | Mean Reward | 5,067 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Orthogonal decision tree | Average Return | 500 |  | Unverified
2 | Oblique decision tree | Average Return | 500 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TLA | Mean Reward | 9,571.99 |  | Unverified
2 | AWR | Mean Reward | 9,136 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TLA | Mean Reward | 3,458.22 |  | Unverified
2 | AWR | Mean Reward | 3,405 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Oblique decision tree | Average Return | 272.14 |  | Unverified
2 | AWR | Average Return | 229 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Orthogonal decision tree | Average Return | -101.72 |  | Unverified
2 | Oblique decision tree | Average Return | -106.02 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TLA with Hierarchical Reward Functions | Mean Reward | -125.02 |  | Unverified
2 | TLA | Mean Reward | -154.92 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | AWR | Mean Reward | 5,813 |  | Unverified
2 | TLA | Mean Reward | 3,878.41 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | AWR | Average Return | 4,996 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TLA | Mean Reward | 9,356.67 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TLA | Mean Reward | 1,000 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TLA | Mean Reward | 93.88 |  | Unverified