SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment to maximize a cumulative reward signal. The agent interacts with the environment and learns by receiving feedback in the form of rewards or punishments for its actions. The goal of reinforcement learning is to find the optimal policy or decision-making strategy that maximizes the long-term reward.
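The agent-environment loop described above can be illustrated with a minimal sketch. It assumes tabular Q-learning with epsilon-greedy exploration on a hypothetical five-state chain; the environment, the function names (step, train, greedy_policy), and all hyperparameters are illustrative assumptions, not taken from any paper listed on this page.

```python
import random

# Hypothetical toy environment: states 0..4 on a line, state 4 is terminal.
N_STATES = 5
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Apply an action and return (next_state, reward, done).

    The only reward is 1.0 for reaching the rightmost state.
    """
    if action == 0:
        next_state = max(0, state - 1)
    else:
        next_state = min(N_STATES - 1, state + 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning; returns a table q[state][action]."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy: explore with probability epsilon,
            # otherwise act greedily (ties broken at random).
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                best = max(q[state])
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # Bootstrapped target: immediate reward plus the discounted
            # value of the best action in the next state.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Best action for each non-terminal state under the learned values."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


if __name__ == "__main__":
    q_table = train()
    print(greedy_policy(q_table))
```

Here the reward signal is the only feedback the agent receives, and the discount factor gamma controls how strongly delayed reward counts toward the long-term objective. In this toy setting the learned greedy policy should choose to move right in every non-terminal state.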

Papers

Showing 10151-10200 of 15113 papers

Title | Status | Hype
Understanding the Pathologies of Approximate Policy Evaluation when Combined with Greedification in Reinforcement Learning |  | 0
Multi-Agent Safe Policy Learning for Power Management of Networked Microgrids |  | 0
Can Reinforcement Learning for Continuous Control Generalize Across Physics Engines? |  | 0
Batch Reinforcement Learning with a Nonparametric Off-Policy Policy Gradient |  | 0
Conservative Safety Critics for Exploration |  | 0
Learning to be Safe: Deep RL with a Safety Critic |  | 0
Affordance as general value function: A computational model |  | 0
Learning Time Reduction Using Warm Start Methods for a Reinforcement Learning Based Supervisory Control in Hybrid Electric Vehicle Applications |  | 0
RH-Net: Improving Neural Relation Extraction via Reinforcement Learning and Hierarchical Relational Searching | Code | 0
Behavior Priors for Efficient Reinforcement Learning |  | 0
Forethought and Hindsight in Credit Assignment |  | 0
Behavioral decision-making for urban autonomous driving in the presence of pedestrians using Deep Recurrent Q-Network |  | 0
High Acceleration Reinforcement Learning for Real-World Juggling with Binary Rewards |  | 0
Contextual Latent-Movements Off-Policy Optimization for Robotic Manipulation Skills |  | 0
VisualHints: A Visual-Lingual Environment for Multimodal Reinforcement Learning |  | 0
Pairwise heuristic sequence alignment algorithm based on deep reinforcement learning |  | 0
OPAL: Offline Primitive Discovery for Accelerating Offline Reinforcement Learning |  | 0
Lyapunov-Based Reinforcement Learning State Estimator |  | 0
Track-Assignment Detailed Routing Using Attention-based Policy Model With Supervision |  | 0
Adaptive Federated Learning and Digital Twin for Industrial Internet of Things |  | 0
How to Make Deep RL Work in Practice | Code | 0
Enhancing reinforcement learning by a finite reward response filter with a case study in intelligent structural control |  | 0
Learning to Deceive Knowledge Graph Augmented Models via Targeted Perturbation | Code | 0
Improving the Exploration of Deep Reinforcement Learning in Continuous Domains using Planning for Policy Search |  | 0
Planning with Exploration: Addressing Dynamics Bottleneck in Model-based Reinforcement Learning |  | 0
Option Hedging with Risk Averse Reinforcement Learning |  | 0
Stabilizing Transformer-Based Action Sequence Generation For Q-Learning |  | 0
Towards Safe Policy Improvement for Non-Stationary MDPs | Code | 0
Stochastic Inverse Reinforcement Learning |  | 0
Improved Worst-Case Regret Bounds for Randomized Least-Squares Value Iteration |  | 0
Deep Reinforcement Learning with Stacked Hierarchical Attention for Text-based Games | Code | 0
Adversarial Attacks on Deep Algorithmic Trading Policies |  | 0
Detecting Rewards Deterioration in Episodic Reinforcement Learning | Code | 0
CoinDICE: Off-Policy Confidence Interval Estimation |  | 0
Error Bounds of Imitating Policies and Environments |  | 0
Incorporating Stylistic Lexical Preferences in Generative Language Models |  | 0
Sample Efficient Reinforcement Learning with REINFORCE |  | 0
What are the Statistical Limits of Offline RL with Linear Function Approximation? |  | 0
Motion Planner Augmented Reinforcement Learning for Robot Manipulation in Obstructed Environments |  | 0
Optimizing Coverage and Capacity in Cellular Networks using Machine Learning |  | 0
Optimising Stochastic Routing for Taxi Fleets with Model Enhanced Reinforcement Learning |  | 0
Reinforcement learning using Deep Q Networks and Q learning accurately localizes brain tumors on MRI with very small training sets |  | 0
On Information Asymmetry in Competitive Multi-Agent Reinforcement Learning: Convergence and Optimality |  | 0
Safety Verification of Model Based Reinforcement Learning Controllers |  | 0
Logistic Q-Learning |  | 0
Deep Reinforcement Learning in Lane Merge Coordination for Connected Vehicles |  | 0
Language Inference with Multi-head Automata through Reinforcement Learning |  | 0
Integrating LEO Satellites and Multi-UAV Reinforcement Learning for Hybrid FSO/RF Non-Terrestrial Networks |  | 0
Runtime Safety Assurance Using Reinforcement Learning |  | 0
Multi-Radar Tracking Optimization for Collaborative Combat |  | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 |  | Unverified
2 | PPO | Mean Normalized Performance | 0.58 |  | Unverified