SOTAVerified

OpenAI Gym

An open-source toolkit from OpenAI that implements several reinforcement learning benchmarks, including classic control, Atari, Robotics, and MuJoCo tasks.

(Description by Evolutionary learning of interpretable decision trees)

(Image Credit: OpenAI Gym)

Papers

Showing 151–200 of 382 papers

| Title | Status | Hype |
|---|---|---|
| Direct Mutation and Crossover in Genetic Algorithms Applied to Reinforcement Learning Tasks | | 0 |
| Discovering Individual Rewards in Collective Behavior through Inverse Multi-Agent Reinforcement Learning | | 0 |
| Distilling Deep RL Models Into Interpretable Neuro-Fuzzy Systems | | 0 |
| Distributionally Robust Statistical Verification with Imprecise Neural Networks | | 0 |
| Double A3C: Deep Reinforcement Learning on OpenAI Gym Games | | 0 |
| DQN with model-based exploration: efficient learning on environments with sparse rewards | | 0 |
| DriverGym: Democratising Reinforcement Learning for Autonomous Driving | | 0 |
| Easy as ABCs: Unifying Boltzmann Q-Learning and Counterfactual Regret Minimization | | 0 |
| EasyRL: A Simple and Extensible Reinforcement Learning Framework | | 0 |
| Elastic Step DQN: A novel multi-step algorithm to alleviate overestimation in Deep QNetworks | | 0 |
| Enhancing Cyber Resilience of Networked Microgrids using Vertical Federated Reinforcement Learning | | 0 |
| Enhancing Hardware Fault Tolerance in Machines with Reinforcement Learning Policy Gradient Algorithms | | 0 |
| Enhancing Privacy and Security of Autonomous UAV Navigation | | 0 |
| Error Controlled Actor-Critic Method to Reinforcement Learning | | 0 |
| Evading Web Application Firewalls with Reinforcement Learning | | 0 |
| Evolutionary Selective Imitation: Interpretable Agents by Imitation Learning Without a Demonstrator | | 0 |
| Implicit Two-Tower Policies | | 0 |
| Improving Reinforcement Learning with Human Assistance: An Argument for Human Subject Studies with HIPPO Gym | | 0 |
| Influence-Based Reinforcement Learning for Intrinsically-Motivated Agents | | 0 |
| In Support of Over-Parametrization in Deep Reinforcement Learning: an Empirical Study | | 0 |
| Interpolated Policy Gradient: Merging On-Policy and Off-Policy Gradient Estimation for Deep Reinforcement Learning | | 0 |
| Investigating Reinforcement Learning Agents for Continuous State Space Environments | | 0 |
| LagNetViP: A Lagrangian Neural Network for Video Prediction | | 0 |
| Multitask Neuroevolution for Reinforcement Learning with Long and Short Episodes | | 0 |
| Learn a Prior for RHEA for Better Online Planning | | 0 |
| Learning Environment Models with Continuous Stochastic Dynamics | | 0 |
| Learning from Demonstrations using Signal Temporal Logic | | 0 |
| Learning Gaussian Policies from Corrective Human Feedback | | 0 |
| Local Environment Poisoning Attacks on Federated Reinforcement Learning | | 0 |
| Long N-step Surrogate Stage Reward to Reduce Variances of Deep Reinforcement Learning in Complex Problems | | 0 |
| Optimizing with Low Budgets: a Comparison on the Black-box Optimization Benchmarking Suite and OpenAI Gym | | 0 |
| Low-cost Real-world Implementation of the Swing-up Pendulum for Deep Reinforcement Learning Experiments | | 0 |
| Machine Learning aided Crop Yield Optimization | | 0 |
| MADRaS : Multi Agent Driving Simulator | | 0 |
| MAGICS: Adversarial RL with Minimax Actors Guided by Implicit Critic Stackelberg for Convergent Neural Synthesis of Robot Safety | | 0 |
| MARTI-4: new model of human brain, considering neocortex and basal ganglia -- learns to play Atari game by reinforcement learning on a single CPU | | 0 |
| MDP Playground: Controlling Orthogonal Dimensions of Hardness in Toy Environments | | 0 |
| Mitigating Plasticity Loss in Continual Reinforcement Learning by Reducing Churn | | 0 |
| Model-based actor-critic: GAN (model generator) + DRL (actor-critic) => AGI | | 0 |
| Robust Reinforcement Learning using Least Squares Policy Iteration with Provable Performance Guarantees | | 0 |
| Modelling non-reinforced preferences using selective attention | | 0 |
| MoET: Interpretable and Verifiable Reinforcement Learning via Mixture of Expert Trees | | 0 |
| MR-iNet Gym: Framework for Edge Deployment of Deep Reinforcement Learning on Embedded Software Defined Radio | | 0 |
| Multi-Agent Reinforcement Learning via Adaptive Kalman Temporal Difference and Successor Representation | | 0 |
| MultiSlot ReRanker: A Generic Model-based Re-Ranking Framework in Recommendation Systems | | 0 |
| Compositional Q-learning for electrolyte repletion with imbalanced patient sub-populations | | 0 |
| Nested Policy Reinforcement Learning for Clinical Decision Support | | 0 |
| Neural architecture impact on identifying temporally extended Reinforcement Learning tasks | | 0 |
| Neural Episodic Control with State Abstraction | | 0 |
| Neuron as an Agent | | 0 |

Benchmark Results

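| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | MEow | Average Return | 6,586.33 | | Unverified |
| 2 | TD3 | Average Return | 5,942.55 | | Unverified |
| 3 | SAC | Average Return | 5,208.09 | | Unverified |
| 4 | DDPG | Average Return | 1,712.12 | | Unverified |
| 5 | PPO | Average Return | 608.97 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | SAC | Average Return | 15,836.04 | | Unverified |
| 2 | DDPG | Average Return | 14,934.86 | | Unverified |
| 3 | TD3 | Average Return | 12,026.73 | | Unverified |
| 4 | MEow | Average Return | 10,981.47 | | Unverified |
| 5 | PPO | Average Return | 6,006.11 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | MEow | Average Return | 3,332.99 | | Unverified |
| 2 | TD3 | Average Return | 3,319.98 | | Unverified |
| 3 | SAC | Average Return | 2,882.56 | | Unverified |
| 4 | DDPG | Average Return | 1,290.24 | | Unverified |
| 5 | PPO | Average Return | 790.77 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | MEow | Average Return | 6,923.22 | | Unverified |
| 2 | SAC | Average Return | 6,211.5 | | Unverified |
| 3 | PPO | Average Return | 925.89 | | Unverified |
| 4 | TD3 | Average Return | 198.44 | | Unverified |
| 5 | DDPG | Average Return | 139.14 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | SAC | Average Return | 5,745.27 | | Unverified |
| 2 | MEow | Average Return | 5,526.66 | | Unverified |
| 3 | DDPG | Average Return | 2,994.54 | | Unverified |
| 4 | PPO | Average Return | 2,739.81 | | Unverified |
| 5 | TD3 | Average Return | 2,612.74 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | TLA | Mean Reward | 5,163.54 | | Unverified |
| 2 | AWR | Mean Reward | 5,067 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Orthogonal decision tree | Average Return | 500 | | Unverified |
| 2 | Oblique decision tree | Average Return | 500 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | TLA | Mean Reward | 9,571.99 | | Unverified |
| 2 | AWR | Mean Reward | 9,136 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | TLA | Mean Reward | 3,458.22 | | Unverified |
| 2 | AWR | Mean Reward | 3,405 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Oblique decision tree | Average Return | 272.14 | | Unverified |
| 2 | AWR | Average Return | 229 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Orthogonal decision tree | Average Return | -101.72 | | Unverified |
| 2 | Oblique decision tree | Average Return | -106.02 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | TLA with Hierarchical Reward Functions | Mean Reward | -125.02 | | Unverified |
| 2 | TLA | Mean Reward | -154.92 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | AWR | Mean Reward | 5,813 | | Unverified |
| 2 | TLA | Mean Reward | 3,878.41 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | AWR | Average Return | 4,996 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | TLA | Mean Reward | 9,356.67 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | TLA | Mean Reward | 1,000 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | TLA | Mean Reward | 93.88 | | Unverified |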
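
The page does not say how the "Average Return" and "Mean Reward" figures above were evaluated. One common reading is the undiscounted sum of rewards per episode, averaged over several evaluation episodes. The sketch below illustrates that reading only; `ToyEnv`, `average_return`, and the fixed policy are hypothetical names made up for illustration, and the environment merely mimics the classic Gym-style `reset()` / `step(action)` interface rather than using the real library.

```python
from statistics import mean


class ToyEnv:
    """Hypothetical stand-in environment with a classic Gym-style interface."""

    def __init__(self, horizon=10):
        self.horizon = horizon  # fixed episode length (an assumption)
        self.t = 0

    def reset(self):
        # Start a new episode and return a dummy observation.
        self.t = 0
        return 0.0

    def step(self, action):
        # Reward 1.0 for action 1, otherwise 0.0; episode ends at the horizon.
        self.t += 1
        reward = 1.0 if action == 1 else 0.0
        done = self.t >= self.horizon
        return float(self.t), reward, done, {}


def average_return(env, policy, episodes=5):
    """Mean of the undiscounted per-episode reward sums."""
    returns = []
    for _ in range(episodes):
        obs, done, total = env.reset(), False, 0.0
        while not done:
            obs, reward, done, _info = env.step(policy(obs))
            total += reward
        returns.append(total)
    return mean(returns)


if __name__ == "__main__":
    # A policy that always picks action 1 earns 1.0 per step for 10 steps.
    print(average_return(ToyEnv(horizon=10), lambda obs: 1))  # prints 10.0
```

Reported numbers on leaderboards like this one may also depend on the number of evaluation episodes, random seeds, and environment version, none of which are stated here.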