SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes long-term reward.
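
The interaction loop described above can be made concrete with a small sketch. The example below uses tabular Q-learning, which is only one of many RL algorithms and is chosen here purely for illustration. The five-state chain environment, the function names (`step`, `train`, `greedy_policy`), and all hyperparameter values are assumptions made for this sketch and are not taken from any paper listed on this page.

```python
import random

N_STATES = 5      # hypothetical chain: states 0..4, state 4 is the rewarding terminal state
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Toy environment transition: returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0  # reward only on reaching the last state
    return next_state, reward, done


def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Learn action values Q[state][action] by interacting with the environment."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                # Explore: pick a random action.
                action = rng.choice(ACTIONS)
            else:
                # Exploit: pick a best-valued action, breaking ties at random.
                best = max(q[state])
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # Q-learning target: immediate reward plus discounted best future value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Best action in each non-terminal state according to the learned values."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


if __name__ == "__main__":
    q_values = train()
    print(greedy_policy(q_values))
```

In this toy setting the learned greedy policy moves right in every non-terminal state, since that is the only way to reach the reward. The discount factor `gamma` is what turns per-step rewards into the long-term objective the agent maximizes.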

Papers

Showing 7751–7800 of 15113 papers

Title | Status | Hype
Guided Data Augmentation for Offline Reinforcement Learning and Imitation Learning | | 0
Guiding Robot Exploration in Reinforcement Learning via Automated Planning | | 0
Guided Exploration in Deep Reinforcement Learning | | 0
Guided Meta-Policy Search | | 0
Guided Online Distillation: Promoting Safe Reinforcement Learning by Offline Demonstration | | 0
Guided Policy Search Based Control of a High Dimensional Advanced Manufacturing Process | | 0
Guided Reinforcement Learning for Robust Multi-Contact Loco-Manipulation | | 0
Guided Safe Shooting: model based reinforcement learning with safety constraints | | 0
LEAGUE: Guided Skill Learning and Abstraction for Long-Horizon Manipulation | | 0
Guiding Global Placement With Reinforcement Learning | | 0
Guiding Reinforcement Learning Exploration Using Natural Language | | 0
Guiding Representation Learning in Deep Generative Models with Policy Gradients | | 0
Guiding Safe Exploration with Weakest Preconditions | | 0
Gym-ANM: Open-source software to leverage reinforcement learning for power system management in research and education | | 0
gym-DSSAT: a crop model turned into a Reinforcement Learning environment | | 0
Gym-preCICE: Reinforcement Learning Environments for Active Flow Control | | 0
Gym-saturation: an OpenAI Gym environment for saturation provers | | 0
H2-MARL: Multi-Agent Reinforcement Learning for Pareto Optimality in Hospital Capacity Strain and Human Mobility during Epidemic | | 0
H2O+: An Improved Framework for Hybrid Offline-and-Online RL with Dynamics Gaps | | 0
Habitat-Web: Learning Embodied Object-Search Strategies from Human Demonstrations at Scale | | 0
HAC Explore: Accelerating Exploration with Hierarchical Reinforcement Learning | | 0
Hacking Google reCAPTCHA v3 using Reinforcement Learning | | 0
HACTS: a Human-As-Copilot Teleoperation System for Robot Learning | | 0
Halftoning with Multi-Agent Deep Reinforcement Learning | | 0
Hallucinated but Factual! Inspecting the Factuality of Hallucinations in Abstractive Summarization | | 0
Hallucinating Value: A Pitfall of Dyna-style Planning with Imperfect Environment Models | | 0
Hallucination-Aware Generative Pretrained Transformer for Cooperative Aerial Mobility Control | | 0
Hamiltonian Policy Optimization | | 0
Hamiltonian Policy Optimization in Reinforcement Learning | | 0
On Using Hamiltonian Monte Carlo Sampling for Reinforcement Learning Problems in High-dimension | | 0
Hamilton-Jacobi-Bellman Equations for Q-Learning in Continuous Time | | 0
Handling Cold-Start Collaborative Filtering with Reinforcement Learning | | 0
Handling Sparse Rewards in Reinforcement Learning Using Model Predictive Control | | 0
Hand-Object Interaction Pretraining from Videos | | 0
Hard instance learning for quantum adiabatic prime factorization | | 0
Hardness in Markov Decision Processes: Theory and Practice | | 0
Hardware-Aware DNN Compression via Diverse Pruning and Mixed-Precision Quantization | | 0
Hardware Trojan Insertion Using Reinforcement Learning | | 0
HARL: Hierarchical Adaptive Reinforcement Learning Based Auto Scheduler for Neural Networks | | 0
Harmonia: A Multi-Agent Reinforcement Learning Approach to Data Placement and Migration in Hybrid Storage Systems | | 0
Harnessing Causality in Reinforcement Learning With Bagged Decision Times | | 0
Harnessing Deep Q-Learning for Enhanced Statistical Arbitrage in High-Frequency Trading: A Comprehensive Exploration | | 0
HARPO: Learning to Subvert Online Behavioral Advertising | | 0
Harvesting energy from turbulent winds with Reinforcement Learning | | 0
HASARD: A Benchmark for Vision-Based Safe Reinforcement Learning in Embodied Agents | | 0
Hashing over Predicted Future Frames for Informed Exploration of Deep Reinforcement Learning | | 0
HAVEN: Hierarchical Cooperative Multi-Agent Reinforcement Learning with Dual Coordination Mechanism | | 0
HCRMP: A LLM-Hinted Contextual Reinforcement Learning Framework for Autonomous Driving | | 0
Hebbian Learning of Bayes Optimal Decisions | | 0
Hebbian Synaptic Modifications in Spiking Neurons that Learn | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | | Unverified