SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment to maximize a cumulative reward signal. The agent interacts with the environment and learns by receiving feedback in the form of rewards or punishments for its actions. The goal of reinforcement learning is to find the optimal policy or decision-making strategy that maximizes the long-term reward.
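The loop described above (act, receive a reward, update, and converge toward a reward-maximizing policy) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and does not come from any listed paper: the `Corridor` environment, its reward values, the hyperparameters, and the choice of tabular Q-learning as the learning rule.

```python
import random


class Corridor:
    """Hypothetical toy environment: cells 0..length-1, start in cell 0, goal in the last cell."""

    def __init__(self, length=5):
        self.length = length
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        # Action 1 moves right, action 0 moves left; the walls clamp the position.
        move = 1 if action == 1 else -1
        self.state = min(max(self.state + move, 0), self.length - 1)
        done = self.state == self.length - 1
        # Feedback signal: +1 for reaching the goal, a small penalty for every other step.
        reward = 1.0 if done else -0.01
        return self.state, reward, done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, max_steps=100, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration (illustrative hyperparameters)."""
    rng = random.Random(seed)
    env = Corridor()
    q = [[0.0, 0.0] for _ in range(env.length)]  # q[state][action]
    for _ in range(episodes):
        state = env.reset()
        for _ in range(max_steps):
            # Explore with probability epsilon, or when both actions look equally good.
            explore = rng.random() < epsilon or q[state][0] == q[state][1]
            action = rng.randrange(2) if explore else int(q[state][1] > q[state][0])
            next_state, reward, done = env.step(action)
            # Bootstrapped target: immediate reward plus discounted best future value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Best action per non-terminal state under the learned values."""
    return [int(right > left) for left, right in q[:-1]]


def discounted_return(rewards, gamma=0.9):
    """Cumulative discounted reward of one episode."""
    total = 0.0
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


if __name__ == "__main__":
    print(greedy_policy(train()))
```

In this sketch the "long-term reward" is the discounted return computed by `discounted_return`, and the learned policy is read off the value table by `greedy_policy`. Larger problems typically replace the table with a function approximator, which this example does not attempt.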

Papers

Showing 11251–11275 of 15113 papers

Title | Status | Hype
Gym-saturation: an OpenAI Gym environment for saturation provers | | 0
H2-MARL: Multi-Agent Reinforcement Learning for Pareto Optimality in Hospital Capacity Strain and Human Mobility during Epidemic | | 0
H2O+: An Improved Framework for Hybrid Offline-and-Online RL with Dynamics Gaps | | 0
Habitat-Web: Learning Embodied Object-Search Strategies from Human Demonstrations at Scale | | 0
HAC Explore: Accelerating Exploration with Hierarchical Reinforcement Learning | | 0
Hacking Google reCAPTCHA v3 using Reinforcement Learning | | 0
HACTS: a Human-As-Copilot Teleoperation System for Robot Learning | | 0
Halftoning with Multi-Agent Deep Reinforcement Learning | | 0
Hallucinated but Factual! Inspecting the Factuality of Hallucinations in Abstractive Summarization | | 0
Hallucinating Value: A Pitfall of Dyna-style Planning with Imperfect Environment Models | | 0
Hallucination-Aware Generative Pretrained Transformer for Cooperative Aerial Mobility Control | | 0
Hamiltonian Policy Optimization | | 0
Hamiltonian Policy Optimization in Reinforcement Learning | | 0
On Using Hamiltonian Monte Carlo Sampling for Reinforcement Learning Problems in High-dimension | | 0
Hamilton-Jacobi-Bellman Equations for Q-Learning in Continuous Time | | 0
Handling Cold-Start Collaborative Filtering with Reinforcement Learning | | 0
Handling Sparse Rewards in Reinforcement Learning Using Model Predictive Control | | 0
Hand-Object Interaction Pretraining from Videos | | 0
Hard instance learning for quantum adiabatic prime factorization | | 0
Hardness in Markov Decision Processes: Theory and Practice | | 0
Hardware-Aware DNN Compression via Diverse Pruning and Mixed-Precision Quantization | | 0
Hardware Trojan Insertion Using Reinforcement Learning | | 0
HARL: Hierarchical Adaptive Reinforcement Learning Based Auto Scheduler for Neural Networks | | 0
Harmonia: A Multi-Agent Reinforcement Learning Approach to Data Placement and Migration in Hybrid Storage Systems | | 0
Harnessing Causality in Reinforcement Learning With Bagged Decision Times | | 0
Page 451 of 605

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | | Unverified