SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes the expected long-term reward.
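
As a rough illustration of the agent-environment loop described above, here is a minimal sketch using tabular Q-learning on a toy corridor. Q-learning is just one possible algorithm, chosen here for brevity. The environment and all names (`step`, `train`, `greedy_policy`, `N_STATES`) and hyperparameter values are hypothetical and do not come from any listed paper.

```python
import random

# Hypothetical toy environment: a corridor of N_STATES cells.
# The agent starts in cell 0 and the episode ends at the last cell.
N_STATES = 5
ACTIONS = (-1, +1)  # move left, move right
GOAL = N_STATES - 1


def step(state, action):
    """Apply an action; reward is 1.0 only when the goal cell is reached."""
    next_state = min(max(state + action, 0), GOAL)
    done = next_state == GOAL
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration (illustrative values)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action_index]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))  # explore
            else:
                best = max(q[state])  # exploit, breaking ties at random
                a = rng.choice([i for i, v in enumerate(q[state]) if v == best])
            next_state, reward, done = step(state, ACTIONS[a])
            # Bootstrapped target: immediate reward plus discounted future value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """Greedy action for each non-terminal state."""
    return [ACTIONS[max(range(len(ACTIONS)), key=lambda i: q[s][i])]
            for s in range(GOAL)]
```

Calling `greedy_policy(train())` should select `+1` (move right) in every non-terminal cell, which is the reward-maximizing policy for this toy setup.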

Papers

Showing 9601–9650 of 15113 papers

Title | Status | Hype
Learning to Deceive Knowledge Graph Augmented Models via Targeted Perturbation | Code | 0
Planning with Exploration: Addressing Dynamics Bottleneck in Model-based Reinforcement Learning | | 0
Stochastic Inverse Reinforcement Learning | | 0
Stabilizing Transformer-Based Action Sequence Generation For Q-Learning | | 0
Towards Safe Policy Improvement for Non-Stationary MDPs | Code | 0
Learning Guidance Rewards with Trajectory-space Smoothing | Code | 1
Improved Worst-Case Regret Bounds for Randomized Least-Squares Value Iteration | | 0
Learning to Dispatch for Job Shop Scheduling via Deep Reinforcement Learning | Code | 1
Bridging Imagination and Reality for Model-Based Deep Reinforcement Learning | Code | 1
Multi-UAV Path Planning for Wireless Data Harvesting with Deep Reinforcement Learning | Code | 1
Option Hedging with Risk Averse Reinforcement Learning | | 0
Optimizing Coverage and Capacity in Cellular Networks using Machine Learning | | 0
Optimising Stochastic Routing for Taxi Fleets with Model Enhanced Reinforcement Learning | | 0
Reinforcement Learning with Combinatorial Actions: An Application to Vehicle Routing | Code | 1
Sample Efficient Reinforcement Learning with REINFORCE | | 0
Adversarial Attacks on Deep Algorithmic Trading Policies | | 0
Deep Reinforcement Learning with Stacked Hierarchical Attention for Text-based Games | Code | 0
Incorporating Stylistic Lexical Preferences in Generative Language Models | | 0
Detecting Rewards Deterioration in Episodic Reinforcement Learning | Code | 0
Batch Exploration with Examples for Scalable Robotic Reinforcement Learning | Code | 1
Error Bounds of Imitating Policies and Environments | | 0
CoinDICE: Off-Policy Confidence Interval Estimation | | 0
Accelerating Reinforcement Learning with Learned Skill Priors | Code | 1
What are the Statistical Limits of Offline RL with Linear Function Approximation? | | 0
Motion Planner Augmented Reinforcement Learning for Robot Manipulation in Obstructed Environments | | 0
Visual Navigation in Real-World Indoor Environments Using End-to-End Deep Reinforcement Learning | Code | 1
Reinforcement learning using Deep Q Networks and Q learning accurately localizes brain tumors on MRI with very small training sets | | 0
Logistic Q-Learning | | 0
Safety Verification of Model Based Reinforcement Learning Controllers | | 0
PARENTing via Model-Agnostic Reinforcement Learning to Correct Pathological Behaviors in Data-to-Text Generation | Code | 1
On Information Asymmetry in Competitive Multi-Agent Reinforcement Learning: Convergence and Optimality | | 0
Improving Generalization in Reinforcement Learning with Mixture Regularization | Code | 1
Correlation-aware Cooperative Multigroup Broadcast 360° Video Delivery Network: A Hierarchical Deep Reinforcement Learning Approach | Code | 1
Multi-Radar Tracking Optimization for Collaborative Combat | | 0
Quality of service based radar resource management using deep reinforcement learning | | 0
Runtime Safety Assurance Using Reinforcement Learning | | 0
Negotiating Team Formation Using Deep Reinforcement Learning | | 0
Robust Constrained Reinforcement Learning for Continuous Control with Model Misspecification | | 0
Reinforcement Learning for Optimization of COVID-19 Mitigation policies | Code | 1
Iterative Amortized Policy Optimization | Code | 1
Integrating LEO Satellites and Multi-UAV Reinforcement Learning for Hybrid FSO/RF Non-Terrestrial Networks | | 0
Deep Reinforcement Learning in Lane Merge Coordination for Connected Vehicles | | 0
Language Inference with Multi-head Automata through Reinforcement Learning | | 0
A case for new neural networks smoothness constraints | | 0
Dream and Search to Control: Latent Space Planning for Continuous Control | Code | 1
Learning by Competition of Self-Interested Reinforcement Learning Agents | Code | 0
Imitation with Neural Density Models | | 0
Connections between Relational Event Model and Inverse Reinforcement Learning for Characterizing Group Interaction Sequences | Code | 2
SMARTS: Scalable Multi-Agent Reinforcement Learning Training School for Autonomous Driving | Code | 2
Model-based Policy Optimization with Unsupervised Model Adaptation | Code | 1
Page 193 of 303

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | | Unverified