SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal of reinforcement learning is to find the optimal policy, that is, the decision-making strategy that maximizes the long-term reward.
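The interaction loop described above can be illustrated with a minimal sketch of tabular Q-learning, one common RL algorithm. Everything here is an assumption made for illustration and does not come from any paper listed on this page: the five-state corridor environment, the `step`, `train`, and `greedy_policy` helpers, and the hyperparameter values are all hypothetical.

```python
import random

# Hypothetical toy environment (illustrative only): a 5-cell corridor.
# The agent starts in cell 0 and the episode ends when it reaches the last cell.
N_STATES = 5
ACTIONS = (0, 1)  # 0 = move left, 1 = move right
GOAL = N_STATES - 1


def step(state, action):
    """Apply an action; return (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else min(GOAL, state + 1)
    reward = 1.0 if next_state == GOAL else 0.0  # reward only at the goal
    return next_state, reward, next_state == GOAL


def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)  # explore
            else:
                # Exploit, breaking ties at random so early episodes still move.
                best = max(q[state])
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # Bootstrapped target: immediate reward plus discounted future value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Pick the highest-valued action in each state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES)]


if __name__ == "__main__":
    q_table = train()
    print(greedy_policy(q_table))
```

In this sketch the discount factor `gamma` is what turns the single reward at the goal into a long-term signal: states closer to the goal accumulate higher values, so the greedy policy should come to prefer moving right in every non-terminal cell. The terminal cell is never updated, so its entry in the policy carries no meaning.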

Papers

Showing 10176-10200 of 15113 papers

| Title | Status | Hype |
| --- | --- | --- |
| Option Hedging with Risk Averse Reinforcement Learning | | 0 |
| Stabilizing Transformer-Based Action Sequence Generation For Q-Learning | | 0 |
| Towards Safe Policy Improvement for Non-Stationary MDPs | Code | 0 |
| Stochastic Inverse Reinforcement Learning | | 0 |
| Improved Worst-Case Regret Bounds for Randomized Least-Squares Value Iteration | | 0 |
| Deep Reinforcement Learning with Stacked Hierarchical Attention for Text-based Games | Code | 0 |
| Adversarial Attacks on Deep Algorithmic Trading Policies | | 0 |
| Detecting Rewards Deterioration in Episodic Reinforcement Learning | Code | 0 |
| CoinDICE: Off-Policy Confidence Interval Estimation | | 0 |
| Error Bounds of Imitating Policies and Environments | | 0 |
| Incorporating Stylistic Lexical Preferences in Generative Language Models | | 0 |
| Sample Efficient Reinforcement Learning with REINFORCE | | 0 |
| What are the Statistical Limits of Offline RL with Linear Function Approximation? | | 0 |
| Motion Planner Augmented Reinforcement Learning for Robot Manipulation in Obstructed Environments | | 0 |
| Optimizing Coverage and Capacity in Cellular Networks using Machine Learning | | 0 |
| Optimising Stochastic Routing for Taxi Fleets with Model Enhanced Reinforcement Learning | | 0 |
| Reinforcement learning using Deep Q Networks and Q learning accurately localizes brain tumors on MRI with very small training sets | | 0 |
| On Information Asymmetry in Competitive Multi-Agent Reinforcement Learning: Convergence and Optimality | | 0 |
| Safety Verification of Model Based Reinforcement Learning Controllers | | 0 |
| Logistic Q-Learning | | 0 |
| Deep Reinforcement Learning in Lane Merge Coordination for Connected Vehicles | | 0 |
| Language Inference with Multi-head Automata through Reinforcement Learning | | 0 |
| Integrating LEO Satellites and Multi-UAV Reinforcement Learning for Hybrid FSO/RF Non-Terrestrial Networks | | 0 |
| Runtime Safety Assurance Using Reinforcement Learning | | 0 |
| Multi-Radar Tracking Optimization for Collaborative Combat | | 0 |
Page 408 of 605

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |