SOTAVerified

OpenAI Gym

An open-source toolkit from OpenAI that implements several reinforcement learning benchmarks, including classic control, Atari, robotics, and MuJoCo tasks.

(Description by Evolutionary learning of interpretable decision trees)


Papers

Showing 101–150 of 382 papers

Title | Status | Hype
Implicit Sensing in Traffic Optimization: Advanced Deep Reinforcement Learning Techniques | | 0
ACPO: A Policy Optimization Algorithm for Average MDPs with Constraints | | 0
Implicit Two-Tower Policies | | 0
Deep Q-Learning with Q-Matrix Transfer Learning for Novel Fire Evacuation Environment | | 0
Deep Q-Network Based Multi-agent Reinforcement Learning with Binary Action Agents | | 0
Deep Learning of Koopman Representation for Control | | 0
Deep Reinforcement Learning for ESG financial portfolio management | | 0
Affine Transport for Sim-to-Real Domain Adaptation | | 0
Behavior Cloning in OpenAI using Case Based Reasoning | | 0
ReaCritic: Large Reasoning Transformer-based DRL Critic-model Scaling For Heterogeneous Networks | | 0
Adversarial Exploration Strategy for Self-Supervised Imitation Learning | | 0
Attention Loss Adjusted Prioritized Experience Replay | | 0
A Comprehensive Guide to Combining R and Python code for Data Science, Machine Learning and Reinforcement Learning | | 0
Benchmarking Algorithms from Machine Learning for Low-Budget Black-Box Optimization | | 0
Design of Artificial Intelligence Agents for Games using Deep Reinforcement Learning | | 0
Asynchronous Deep Double Duelling Q-Learning for Trading-Signal Execution in Limit Order Book Markets | | 0
Dealing with Sparse Rewards in Continuous Control Robotics via Heavy-Tailed Policies | | 0
Adversarial Body Shape Search for Legged Robots | | 0
Improving Reinforcement Learning with Human Assistance: An Argument for Human Subject Studies with HIPPO Gym | | 0
Differentially Private Temporal Difference Learning with Stochastic Nonconvex-Strongly-Concave Optimization | | 0
Interpolated Policy Gradient: Merging On-Policy and Off-Policy Gradient Estimation for Deep Reinforcement Learning | | 0
Discovering Individual Rewards in Collective Behavior through Inverse Multi-Agent Reinforcement Learning | | 0
Learning Gaussian Policies from Corrective Human Feedback | | 0
Distilling Deep RL Models Into Interpretable Neuro-Fuzzy Systems | | 0
Distributionally Robust Statistical Verification with Imprecise Neural Networks | | 0
Double A3C: Deep Reinforcement Learning on OpenAI Gym Games | | 0
Data Driven Control with Learned Dynamics: Model-Based versus Model-Free Approach | | 0
Curiosity-Driven Experience Prioritization via Density Estimation | | 0
Towards Understanding Asynchronous Advantage Actor-critic: Convergence and Linear Speedup | | 0
CT-DQN: Control-Tutored Deep Reinforcement Learning | | 0
CrowdPlay: Crowdsourcing human demonstration data for offline learning in Atari games | | 0
A Surrogate-Assisted Controller for Expensive Evolutionary Reinforcement Learning | | 0
A Strategy-Oriented Bayesian Soft Actor-Critic Model | | 0
Advantage Weighted Regression: Simple and Scalable Off-Policy Reinforcement Learning | | 0
Correcting discount-factor mismatch in on-policy policy gradient methods | | 0
Hypothesis Driven Coordinate Ascent for Reinforcement Learning | | 0
Illuminating Spaces: Deep Reinforcement Learning and Laser-Wall Partitioning for Architectural Layout Generation | | 0
Control-Tutored Reinforcement Learning: Towards the Integration of Data-Driven and Model-Based Control | | 0
HomeLabGym: A real-world testbed for home energy management systems | | 0
Controlling an Inverted Pendulum with Policy Gradient Methods-A Tutorial | | 0
A Closed-Loop Multi-perspective Visual Servoing Approach with Reinforcement Learning | | 0
Human AI interaction loop training: New approach for interactive reinforcement learning | | 0
Continuous-time Value Function Approximation in Reproducing Kernel Hilbert Spaces | | 0
AppBuddy: Learning to Accomplish Tasks in Mobile Apps via Reinforcement Learning | | 0
gym-saturation: Gymnasium environments for saturation provers (System description) | | 0
A Dual Memory Structure for Efficient Use of Replay Memory in Deep Reinforcement Learning | | 0
Decision-Making in Reinforcement Learning | | 0
HoME: a Household Multimodal Environment | | 0
Hybrid Policies Using Inverse Rewards for Reinforcement Learning | | 0
Imaginary Hindsight Experience Replay: Curious Model-based Learning for Sparse Reward Tasks | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | MEow | Average Return | 6,586.33 | | Unverified
2 | TD3 | Average Return | 5,942.55 | | Unverified
3 | SAC | Average Return | 5,208.09 | | Unverified
4 | DDPG | Average Return | 1,712.12 | | Unverified
5 | PPO | Average Return | 608.97 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SAC | Average Return | 15,836.04 | | Unverified
2 | DDPG | Average Return | 14,934.86 | | Unverified
3 | TD3 | Average Return | 12,026.73 | | Unverified
4 | MEow | Average Return | 10,981.47 | | Unverified
5 | PPO | Average Return | 6,006.11 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | MEow | Average Return | 3,332.99 | | Unverified
2 | TD3 | Average Return | 3,319.98 | | Unverified
3 | SAC | Average Return | 2,882.56 | | Unverified
4 | DDPG | Average Return | 1,290.24 | | Unverified
5 | PPO | Average Return | 790.77 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | MEow | Average Return | 6,923.22 | | Unverified
2 | SAC | Average Return | 6,211.5 | | Unverified
3 | PPO | Average Return | 925.89 | | Unverified
4 | TD3 | Average Return | 198.44 | | Unverified
5 | DDPG | Average Return | 139.14 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SAC | Average Return | 5,745.27 | | Unverified
2 | MEow | Average Return | 5,526.66 | | Unverified
3 | DDPG | Average Return | 2,994.54 | | Unverified
4 | PPO | Average Return | 2,739.81 | | Unverified
5 | TD3 | Average Return | 2,612.74 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TLA | Mean Reward | 5,163.54 | | Unverified
2 | AWR | Mean Reward | 5,067 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Orthogonal decision tree | Average Return | 500 | | Unverified
2 | Oblique decision tree | Average Return | 500 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TLA | Mean Reward | 9,571.99 | | Unverified
2 | AWR | Mean Reward | 9,136 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TLA | Mean Reward | 3,458.22 | | Unverified
2 | AWR | Mean Reward | 3,405 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Oblique decision tree | Average Return | 272.14 | | Unverified
2 | AWR | Average Return | 229 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Orthogonal decision tree | Average Return | -101.72 | | Unverified
2 | Oblique decision tree | Average Return | -106.02 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TLA with Hierarchical Reward Functions | Mean Reward | -125.02 | | Unverified
2 | TLA | Mean Reward | -154.92 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | AWR | Mean Reward | 5,813 | | Unverified
2 | TLA | Mean Reward | 3,878.41 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | AWR | Average Return | 4,996 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TLA | Mean Reward | 9,356.67 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TLA | Mean Reward | 1,000 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TLA | Mean Reward | 93.88 | | Unverified
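
The page does not say how the Average Return figures above were computed. One common convention, assumed here, is to sum the undiscounted rewards of each evaluation episode and average those sums over several episodes. The sketch below illustrates that arithmetic against a hypothetical toy environment (`ToyCorridorEnv`, not part of Gym) that mimics the classic Gym interface, where `reset()` returns an observation and `step(action)` returns `(observation, reward, done, info)`. Newer Gym and Gymnasium releases return extra values from both calls, so the loop would need adjusting there.

```python
from statistics import mean


class ToyCorridorEnv:
    """Hypothetical stand-in environment, for illustration only.

    The agent starts at position 0 and must reach position `length`.
    Every step costs -1; reaching the goal pays +10 instead.
    """

    def __init__(self, length=5, max_steps=20):
        self.length = length
        self.max_steps = max_steps
        self.position = 0
        self.steps = 0

    def reset(self):
        self.position = 0
        self.steps = 0
        return self.position

    def step(self, action):
        # Action 1 moves right; any other action moves left (floored at 0).
        delta = 1 if action == 1 else -1
        self.position = max(0, self.position + delta)
        self.steps += 1
        reached_goal = self.position >= self.length
        reward = 10.0 if reached_goal else -1.0
        done = reached_goal or self.steps >= self.max_steps
        return self.position, reward, done, {}


def average_return(env, policy, episodes=10):
    """Mean of the undiscounted per-episode reward sums (assumed convention)."""
    returns = []
    for _ in range(episodes):
        obs = env.reset()
        done = False
        total = 0.0
        while not done:
            obs, reward, done, _info = env.step(policy(obs))
            total += reward
        returns.append(total)
    return mean(returns)


def always_right(obs):
    """Trivial policy that ignores the observation and always moves right."""
    return 1


if __name__ == "__main__":
    print(average_return(ToyCorridorEnv(), always_right))
```

With the default corridor length of 5, the always-right policy pays four step penalties and then collects the goal reward, so each episode returns 6.0. Figures reported under a different protocol (discounting, a different number of evaluation episodes, or averaging over seeds) would not be directly comparable.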