SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment to maximize a cumulative reward signal. The agent interacts with the environment and learns by receiving feedback in the form of rewards or punishments for its actions. The goal of reinforcement learning is to find the optimal policy or decision-making strategy that maximizes the long-term reward.
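As a rough illustration of the loop described above, the sketch below trains an agent with tabular Q-learning, one common RL algorithm chosen here only as an example. The five-state chain environment, the `step` function, and all hyperparameters (`alpha`, `gamma`, `epsilon`, `episodes`) are hypothetical choices made for this sketch and are not taken from any paper listed on this page.

```python
import random

def step(state, action, n_states=5):
    """Hypothetical chain environment: action 1 moves right, action 0 moves left.
    Reaching the rightmost state gives reward 1 and ends the episode."""
    if action == 1:
        next_state = min(state + 1, n_states - 1)
    else:
        next_state = max(state - 1, 0)
    done = next_state == n_states - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done

def q_learning(n_states=5, episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration (assumed settings)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]  # q[state][action]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon, or when both actions look equal.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, action, n_states)
            # The target is the reward plus the discounted best future value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q

def greedy_policy(q):
    """Pick the higher-valued action in each state (ties go to action 1)."""
    return [0 if left > right else 1 for left, right in q]

q_table = q_learning()
policy = greedy_policy(q_table)
print(policy)
```

Here the "policy" is simply the greedy action per state read off the learned table. In this toy setup the agent should come to prefer moving right in every non-terminal state, since that is the only route to the reward.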

Papers

Showing 11301-11350 of 15113 papers

| Title | Status | Hype |
|---|---|---|
| Reward Shaping for Human Learning via Inverse Reinforcement Learning | Code | 0 |
| Backpropamine: training self-modifying neural networks with differentiable neuromodulated plasticity | | 0 |
| Scalable Multi-Agent Inverse Reinforcement Learning via Actor-Attention-Critic | | 0 |
| Millimeter Wave Communications with an Intelligent Reflector: Performance Optimization and Distributional Reinforcement Learning | | 0 |
| Safe reinforcement learning for probabilistic reachability and safety specifications: A Lyapunov-based approach | Code | 0 |
| Optimizing Traffic Lights with Multi-agent Deep Reinforcement Learning and V2X communication | | 0 |
| Wireless 2.0: Towards an Intelligent Radio Environment Empowered by Reconfigurable Meta-Surfaces and Artificial Intelligence | | 0 |
| Near-optimal Regret Bounds for Stochastic Shortest Path | | 0 |
| Rapidly Personalizing Mobile Health Treatment Policies with Limited Data | | 0 |
| Deep Reinforcement Learning with Linear Quadratic Regulator Regions | | 0 |
| Automatic Data Augmentation via Deep Reinforcement Learning for Effective Kidney Tumor Segmentation | | 0 |
| Guided Constrained Policy Optimization for Dynamic Quadrupedal Robot Locomotion | | 0 |
| Adversarial Radar Inference. From Inverse Tracking to Inverse Reinforcement Learning of Cognitive Radar | | 0 |
| Vehicle Tracking in Wireless Sensor Networks via Deep Reinforcement Learning | | 0 |
| On the Search for Feedback in Reinforcement Learning | | 0 |
| Accelerating Reinforcement Learning with a Directional-Gaussian-Smoothing Evolution Strategy | | 0 |
| Data Freshness and Energy-Efficient UAV Navigation Optimization: A Deep Reinforcement Learning Approach | | 0 |
| Disentangling Controllable Object through Video Prediction Improves Visual Reinforcement Learning | | 0 |
| Adaptive Temporal Difference Learning with Linear Function Approximation | | 0 |
| Automatic Gesture Recognition in Robot-assisted Surgery with Reinforcement Learning and Tree Search | | 0 |
| Enhanced Adversarial Strategically-Timed Attacks against Deep Reinforcement Learning | | 0 |
| Multi-Agent Reinforcement Learning as a Computational Tool for Language Evolution Research: Historical Context and Future Challenges | | 0 |
| oIRL: Robust Adversarial Inverse Reinforcement Learning with Temporally Extended Actions | | 0 |
| Debiased Off-Policy Evaluation for Recommendation Systems | | 0 |
| Multi-Agent Meta-Reinforcement Learning for Self-Powered and Sustainable Edge Computing Systems | | 0 |
| UAV Aided Search and Rescue Operation Using Reinforcement Learning | | 0 |
| Value-driven Hindsight Modelling | | 0 |
| Optimistic Policy Optimization with Bandit Feedback | | 0 |
| Keep Doing What Worked: Behavioral Modelling Priors for Offline Reinforcement Learning | | 0 |
| Efficient Deep Reinforcement Learning via Adaptive Policy Transfer | Code | 0 |
| Curriculum in Gradient-Based Meta-Reinforcement Learning | | 0 |
| KoGuN: Accelerating Deep Reinforcement Learning via Integrating Human Suboptimal Knowledge | | 0 |
| Empirical Policy Evaluation with Supergraphs | | 0 |
| Adaptive Estimator Selection for Off-Policy Evaluation | Code | 0 |
| Multi-Issue Bargaining With Deep Reinforcement Learning | | 0 |
| MoTiAC: Multi-Objective Actor-Critics for Real-Time Bidding | | 0 |
| Reinforcement learning for the privacy preservation and manipulation of eye tracking data | | 0 |
| Reward Design for Driver Repositioning Using Multi-Agent Reinforcement Learning | | 0 |
| Langevin DQN | Code | 0 |
| Control Frequency Adaptation via Action Persistence in Batch Reinforcement Learning | Code | 0 |
| Adaptive Experience Selection for Policy Gradient | | 0 |
| Investigating Simple Object Representations in Model-Free Deep Reinforcement Learning | | 0 |
| Non-asymptotic Convergence of Adam-type Reinforcement Learning Algorithms under Markovian Sampling | | 0 |
| The Archimedean trap: Why traditional reinforcement learning will probably not yield AGI | | 0 |
| Universal Value Density Estimation for Imitation Learning and Goal-Conditioned Reinforcement Learning | Code | 0 |
| Resource Management in Wireless Networks via Multi-Agent Deep Reinforcement Learning | | 0 |
| Robust Reinforcement Learning via Adversarial training with Langevin Dynamics | Code | 0 |
| Extended Markov Games to Learn Multiple Tasks in Multi-Agent Reinforcement Learning | Code | 0 |
| Deep Reinforcement Learning-Based Beam Tracking for Low-Latency Services in Vehicular Networks | | 0 |
| Improving Generalization of Reinforcement Learning with Minimax Distributional Soft Actor-Critic | | 0 |
Page 227 of 303

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |