SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment to maximize a cumulative reward signal. The agent interacts with the environment and learns by receiving feedback in the form of rewards or punishments for its actions. The goal of reinforcement learning is to find the optimal policy or decision-making strategy that maximizes the long-term reward.
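The interaction loop described above can be sketched in a few lines of Python. This is only an illustrative sketch, not a method taken from any paper listed on this page: the five-cell corridor environment, the `step`, `train`, and `greedy_policy` helpers, and all hyperparameter values are assumptions made for the example, and tabular Q-learning is used simply as one well-known way to learn a policy from reward feedback.

```python
import random

N_STATES = 5          # hypothetical corridor with cells 0..4; cell 4 is the goal
ACTIONS = (0, 1)      # 0 = move left, 1 = move right

def step(state, action):
    """Toy environment (assumed for illustration): returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done

def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(200):  # step cap so every episode terminates
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)  # explore
            else:
                best = max(q[state])          # exploit, breaking ties at random
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # Move the estimate toward reward plus discounted best future value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q

def greedy_policy(q):
    """Best-valued action for each non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]

q_table = train()
policy = greedy_policy(q_table)
```

Here the reward arrives only at the goal cell, so the agent has to propagate that signal backward through its value estimates; the discount factor `gamma` is what makes shorter routes to the goal worth more than longer ones.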

Papers

Showing 7501-7550 of 15113 papers

Title | Status | Hype
Data Generation Method for Learning a Low-dimensional Safe Region in Safe Reinforcement Learning | | 0
Deep Reinforcement Learning for Equal Risk Pricing and Hedging under Dynamic Expectile Risk Measures | | 0
Incentivizing an Unknown Crowd | | 0
TimeTraveler: Reinforcement Learning for Temporal Knowledge Graph Forecasting | Code | 1
OPIRL: Sample Efficient Off-Policy Inverse Reinforcement Learning via Distribution Matching | Code | 0
User Tampering in Reinforcement Learning Recommender Systems | | 0
Self-supervised Reinforcement Learning with Independently Controllable Subgoals | | 0
PowerGym: A Reinforcement Learning Environment for Volt-Var Control in Power Distribution Systems | Code | 1
Membership Inference Attacks Against Temporally Correlated Data in Deep Reinforcement Learning | | 0
A Deep Reinforcement Learning Approach for Online Parcel Assignment | | 0
A Bayesian Learning Algorithm for Unknown Zero-sum Stochastic Games with an Arbitrary Opponent | | 0
A Survey of Deep Reinforcement Learning in Recommender Systems: A Systematic Review and Future Directions | | 0
Convergence of Batch Asynchronous Stochastic Approximation With Applications to Reinforcement Learning | | 0
Integrated and Adaptive Guidance and Control for Endoatmospheric Missiles via Reinforcement Learning | | 0
CyGIL: A Cyber Gym for Training Autonomous Agents over Emulated Network Systems | | 0
Robust Predictable Control | | 0
Safety-Critical Learning of Robot Control with Temporal Logic Specifications | | 0
The Sensory Neuron as a Transformer: Permutation-Invariant Neural Networks for Reinforcement Learning | Code | 2
Optimizing Quantum Variational Circuits with Deep Reinforcement Learning | Code | 1
On the impact of MDP design for Reinforcement Learning agents in Resource Management | | 0
Deep SIMBAD: Active Landmark-based Self-localization Using Ranking-based Scene Descriptor | | 0
Delving into Macro Placement with Reinforcement Learning | | 0
Enhancing Visual Dialog Questioner with Entity-based Strategy Learning and Augmented Guesser | Code | 0
Hindsight Reward Tweaking via Conditional Deep Reinforcement Learning | | 0
Guiding Global Placement With Reinforcement Learning | | 0
Method for making multi-attribute decisions in wargames by combining intuitionistic fuzzy numbers with reinforcement learning | | 0
Recommendation Fairness: From Static to Dynamic | | 0
Temporal Shift Reinforcement Learning | Code | 0
On the Complexity of Computing Markov Perfect Equilibrium in General-Sum Stochastic Games | | 0
Eden: A Unified Environment Framework for Booming Reinforcement Learning Algorithms | | 0
Provably Safe Model-Based Meta Reinforcement Learning: An Abstraction-Based Approach | | 0
Multi-agent Natural Actor-critic Reinforcement Learning Algorithms | | 0
Unsupervised multi-latent space reinforcement learning framework for video summarization in ultrasound imaging | Code | 0
Reinforcement Learning for Battery Energy Storage Dispatch augmented with Model-based Optimizer | | 0
Self-timed Reinforcement Learning using Tsetlin Machine | | 0
Multi-Agent Inverse Reinforcement Learning: Suboptimal Demonstrations and Alternative Solution Concepts | | 0
A Comparative Study of Algorithms for Intelligent Traffic Signal Control | Code | 2
An Oracle and Observations for the OpenAI Gym / ALE Freeway Environment | | 0
A Survey of Exploration Methods in Reinforcement Learning | | 0
Boosting Search Engines with Interactive Agents | | 0
Catastrophic Interference in Reinforcement Learning: A Solution Based on Context Division and Knowledge Distillation | Code | 0
OptAGAN: Entropy-based finetuning on text VAE-GAN | Code | 0
Variational Quantum Reinforcement Learning via Evolutionary Optimization | | 0
Informing Autonomous Deception Systems with Cyber Expert Performance Data | | 0
Incorporating Deception into CyberBattleSim for Autonomous Defense | | 0
WarpDrive: Extremely Fast End-to-End Deep Multi-Agent Reinforcement Learning on a GPU | Code | 1
Hallucinated but Factual! Inspecting the Factuality of Hallucinations in Abstractive Summarization | Code | 1
Deep Reinforcement Learning at the Edge of the Statistical Precipice | Code | 1
Investigating Vulnerabilities of Deep Neural Policies | | 0
Adaptive perturbation adversarial training: based on reinforcement learning | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | | Unverified