SOTAVerified

OpenAI Gym

An open-source toolkit from OpenAI that implements several reinforcement learning benchmarks, including classic control, Atari, robotics, and MuJoCo tasks.

(Description from the paper "Evolutionary learning of interpretable decision trees")


Papers

Showing 101–150 of 382 papers

| Title | Status | Hype |
| --- | --- | --- |
| HistoGym: A Reinforcement Learning Environment for Histopathological Image Analysis | Code | 0 |
| Adaptive Planning with Generative Models under Uncertainty | | 0 |
| Enhancing Hardware Fault Tolerance in Machines with Reinforcement Learning Policy Gradient Algorithms | | 0 |
| A Comprehensive Guide to Combining R and Python code for Data Science, Machine Learning and Reinforcement Learning | | 0 |
| Traffic control using intelligent timing of traffic lights with reinforcement learning technique and real-time processing of surveillance camera images | | 0 |
| Decision Mamba Architectures | Code | 0 |
| SwiftRL: Towards Efficient Reinforcement Learning on Real Processing-In-Memory Systems | Code | 0 |
| Off-OAB: Off-Policy Policy Gradient Method with Optimal Action-Dependent Baseline | | 0 |
| Airlift Challenge: A Competition for Optimizing Cargo Delivery | | 0 |
| Enhancing Privacy and Security of Autonomous UAV Navigation | | 0 |
| HomeLabGym: A real-world testbed for home energy management systems | | 0 |
| Noisy Spiking Actor Network for Exploration | | 0 |
| QF-tuner: Breaking Tradition in Reinforcement Learning | | 0 |
| MORE-3S:Multimodal-based Offline Reinforcement Learning with Shared Semantic Spaces | Code | 0 |
| Easy as ABCs: Unifying Boltzmann Q-Learning and Counterfactual Regret Minimization | | 0 |
| Scilab-RL: A software framework for efficient reinforcement learning and cognitive modeling research | | 0 |
| MultiSlot ReRanker: A Generic Model-based Re-Ranking Framework in Recommendation Systems | | 0 |
| Decision Making in Non-Stationary Environments with Policy-Augmented Search | Code | 0 |
| A Closed-Loop Multi-perspective Visual Servoing Approach with Reinforcement Learning | | 0 |
| Investigating the Performance and Reliability, of the Q-Learning Algorithm in Various Unknown Environments | Code | 0 |
| Efficient Parallel Reinforcement Learning Framework using the Reactor Model | Code | 0 |
| Resilient Control of Networked Microgrids using Vertical Federated Reinforcement Learning: Designs and Real-Time Test-Bed Validations | | 0 |
| Guaranteeing Control Requirements via Reward Shaping in Reinforcement Learning | Code | 0 |
| Bridging Dimensions: Confident Reachability for High-Dimensional Controllers | Code | 0 |
| Repairing Learning-Enabled Controllers While Preserving What Works | Code | 0 |
| SDGym: Low-Code Reinforcement Learning Environments using System Dynamics Models | Code | 0 |
| Neural architecture impact on identifying temporally extended Reinforcement Learning tasks | | 0 |
| Optimizing with Low Budgets: a Comparison on the Black-box Optimization Benchmarking Suite and OpenAI Gym | | 0 |
| Implicit Sensing in Traffic Optimization: Advanced Deep Reinforcement Learning Techniques | | 0 |
| gym-saturation: Gymnasium environments for saturation provers (System description) | | 0 |
| Attention Loss Adjusted Prioritized Experience Replay | | 0 |
| Distributionally Robust Statistical Verification with Imprecise Neural Networks | | 0 |
| Statistically Efficient Variance Reduction with Double Policy Estimation for Off-Policy Evaluation in Sequence-Modeled Reinforcement Learning | | 0 |
| On Combining Expert Demonstrations in Imitation Learning via Optimal Transport | | 0 |
| Scaling Distributed Multi-task Reinforcement Learning with Experience Sharing | | 0 |
| Dynamic Observation Policies in Observation Cost-Sensitive Reinforcement Learning | Code | 0 |
| Learning Environment Models with Continuous Stochastic Dynamics | | 0 |
| Correcting discount-factor mismatch in on-policy policy gradient methods | | 0 |
| Comparing the Efficacy of Fine-Tuning and Meta-Learning for Few-Shot Policy Imitation | Code | 0 |
| Deep Reinforcement Learning for ESG financial portfolio management | | 0 |
| Mimicking Better by Matching the Approximate Action Distribution | Code | 0 |
| Active Inference in Hebbian Learning Networks | | 0 |
| Risk-Aware Reward Shaping of Reinforcement Learning Agents for Autonomous Driving | Code | 0 |
| Optimizing Attention and Cognitive Control Costs Using Temporally-Layered Architectures | Code | 0 |
| Discovering Individual Rewards in Collective Behavior through Inverse Multi-Agent Reinforcement Learning | | 0 |
| Rethinking Population-assisted Off-policy Reinforcement Learning | | 0 |
| Gym-preCICE: Reinforcement Learning Environments for Active Flow Control | | 0 |
| Signal Novelty Detection as an Intrinsic Reward for Robotics | Code | 0 |
| Exact and Cost-Effective Automated Transformation of Neural Network Controllers to Decision Tree Controllers | | 0 |
| Causal Repair of Learning-enabled Cyber-physical Systems | | 0 |

Benchmark Results

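| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | MEow | Average Return | 6,586.33 | | Unverified |
| 2 | TD3 | Average Return | 5,942.55 | | Unverified |
| 3 | SAC | Average Return | 5,208.09 | | Unverified |
| 4 | DDPG | Average Return | 1,712.12 | | Unverified |
| 5 | PPO | Average Return | 608.97 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | SAC | Average Return | 15,836.04 | | Unverified |
| 2 | DDPG | Average Return | 14,934.86 | | Unverified |
| 3 | TD3 | Average Return | 12,026.73 | | Unverified |
| 4 | MEow | Average Return | 10,981.47 | | Unverified |
| 5 | PPO | Average Return | 6,006.11 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | MEow | Average Return | 3,332.99 | | Unverified |
| 2 | TD3 | Average Return | 3,319.98 | | Unverified |
| 3 | SAC | Average Return | 2,882.56 | | Unverified |
| 4 | DDPG | Average Return | 1,290.24 | | Unverified |
| 5 | PPO | Average Return | 790.77 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | MEow | Average Return | 6,923.22 | | Unverified |
| 2 | SAC | Average Return | 6,211.5 | | Unverified |
| 3 | PPO | Average Return | 925.89 | | Unverified |
| 4 | TD3 | Average Return | 198.44 | | Unverified |
| 5 | DDPG | Average Return | 139.14 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | SAC | Average Return | 5,745.27 | | Unverified |
| 2 | MEow | Average Return | 5,526.66 | | Unverified |
| 3 | DDPG | Average Return | 2,994.54 | | Unverified |
| 4 | PPO | Average Return | 2,739.81 | | Unverified |
| 5 | TD3 | Average Return | 2,612.74 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | TLA | Mean Reward | 5,163.54 | | Unverified |
| 2 | AWR | Mean Reward | 5,067 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | Orthogonal decision tree | Average Return | 500 | | Unverified |
| 2 | Oblique decision tree | Average Return | 500 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | TLA | Mean Reward | 9,571.99 | | Unverified |
| 2 | AWR | Mean Reward | 9,136 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | TLA | Mean Reward | 3,458.22 | | Unverified |
| 2 | AWR | Mean Reward | 3,405 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | Oblique decision tree | Average Return | 272.14 | | Unverified |
| 2 | AWR | Average Return | 229 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | Orthogonal decision tree | Average Return | -101.72 | | Unverified |
| 2 | Oblique decision tree | Average Return | -106.02 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | TLA with Hierarchical Reward Functions | Mean Reward | -125.02 | | Unverified |
| 2 | TLA | Mean Reward | -154.92 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | AWR | Mean Reward | 5,813 | | Unverified |
| 2 | TLA | Mean Reward | 3,878.41 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | AWR | Average Return | 4,996 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | TLA | Mean Reward | 9,356.67 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | TLA | Mean Reward | 1,000 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | TLA | Mean Reward | 93.88 | | Unverified |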