SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes the expected long-term reward.
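The interaction loop described above can be illustrated with a small sketch. It uses tabular Q-learning, one common RL method chosen here only for illustration, on a hypothetical five-cell corridor where the agent earns a reward of 1 for reaching the rightmost cell. The environment, the function names (`step`, `train`, `greedy_policy`), and all hyperparameters are assumptions made for this example and do not come from any paper listed on this page.

```python
import random

N_STATES = 5      # corridor cells 0..4; cell 4 is the goal (hypothetical toy task)
ACTIONS = (0, 1)  # 0 = step left, 1 = step right


def step(state, action):
    """Toy environment: return (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon, or when the estimates are tied.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.choice(ACTIONS)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, action)
            # Bootstrapped target: immediate reward plus discounted best next value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Best action for each non-terminal state under the learned estimates."""
    return [0 if q[s][0] > q[s][1] else 1 for s in range(N_STATES - 1)]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

In this sketch the feedback is the scalar `reward` returned by `step`, and the learned policy is read off the value table by picking the highest-valued action in each state. The discount factor `gamma` is what makes the agent prefer the long-term payoff of walking right over staying near the start.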

Papers

Showing 7751-7800 of 15113 papers

Title | Status | Hype
Reinforcement Learning for Adaptive Optimal Stationary Control of Linear Stochastic Systems | Code | 1
MODRL/D-EL: Multiobjective Deep Reinforcement Learning with Evolutionary Learning for Multiobjective Optimization | - | 0
Robust Risk-Sensitive Reinforcement Learning Agents for Trading Markets | - | 0
Reinforcement Learning for Education: Opportunities and Challenges | - | 0
Evaluation of Human-AI Teams for Learned and Rule-Based Agents in Hanabi | - | 0
Deep Reinforcement Learning based Dynamic Optimization of Bus Timetable | - | 0
High-level Decisions from a Safe Maneuver Catalog with Reinforcement Learning for Safe and Cooperative Automated Merging | - | 0
A Reinforcement Learning Environment for Mathematical Reasoning via Program Synthesis | Code | 1
NeuSaver: Neural Adaptive Power Consumption Optimization for Mobile Video Streaming | - | 0
Minimizing Safety Interference for Safe and Comfortable Automated Driving with Distributional Reinforcement Learning | - | 0
MURAL: Meta-Learning Uncertainty-Aware Rewards for Outcome-Driven Reinforcement Learning | - | 0
PC-MLP: Model-based Reinforcement Learning with Policy Cover Guided Exploration | Code | 0
Safer Reinforcement Learning through Transferable Instinct Networks | Code | 0
Plan-Based Relaxed Reward Shaping for Goal-Directed Tasks | - | 0
QoS-Aware Scheduling in New Radio Using Deep Reinforcement Learning | - | 0
Mixing Human Demonstrations with Self-Exploration in Experience Replay for Deep Reinforcement Learning | - | 0
Model-free Reinforcement Learning for Robust Locomotion using Demonstrations from Trajectory Optimization | - | 0
Surgical Instruction Generation with Transformers | Code | 1
Scalable Evaluation of Multi-Agent Reinforcement Learning with Melting Pot | - | 0
Centralized Model and Exploration Policy for Multi-Agent RL | Code | 0
Experimental Evidence that Empowerment May Drive Exploration in Sparse-Reward Environments | - | 0
Going Beyond Linear RL: Sample Efficient Neural Function Approximation | - | 0
Deep Adaptive Multi-Intention Inverse Reinforcement Learning | Code | 0
Carle's Game: An Open-Ended Challenge in Exploratory Machine Creativity | Code | 0
Cautious Policy Programming: Exploiting KL Regularization in Monotonic Policy Improvement for Reinforcement Learning | - | 0
ReLLIE: Deep Reinforcement Learning for Customized Low-Light Image Enhancement | Code | 1
Model Selection for Generic Reinforcement Learning | - | 0
Shortest-Path Constrained Reinforcement Learning for Sparse Reward Tasks | Code | 1
Teaching Agents how to Map: Spatial Reasoning for Multi-Object Navigation | Code | 1
Pessimistic Model-based Offline Reinforcement Learning under Partial Coverage | - | 0
Why Generalization in RL is Difficult: Epistemic POMDPs and Implicit Partial Observability | - | 0
A Deep Reinforcement Learning Approach for Traffic Signal Control Optimization | - | 0
Conservative Offline Distributional Reinforcement Learning | Code | 1
Explore and Control with Adversarial Surprise | Code | 1
Reinforcement Learning based Proactive Control for Transmission Grid Resilience to Wildfire | - | 0
The Role of Pretrained Representations for the OOD Generalization of Reinforcement Learning Agents | - | 0
Modeling Explicit Concerning States for Reinforcement Learning in Visual Dialogue | Code | 0
Polynomial Time Reinforcement Learning in Factored State MDPs with Linear Value Functions | - | 0
Towards Better Laplacian Representation in Reinforcement Learning with Generalized Graph Drawing | Code | 1
R3L: Connecting Deep Reinforcement Learning to Recurrent Neural Networks for Image Denoising via Residual Recovery | - | 0
CoBERL: Contrastive BERT for Reinforcement Learning | - | 0
Behavior Constraining in Weight Space for Offline Reinforcement Learning | - | 0
A Simple Reward-free Approach to Constrained Reinforcement Learning | - | 0
Generating stable molecules using imitation and reinforcement learning | - | 0
Out-of-Distribution Dynamics Detection: RL-Relevant Benchmarks and Results | Code | 1
Backprop-Free Reinforcement Learning with Active Neural Generative Coding | Code | 1
Distributed Deep Reinforcement Learning for Intelligent Traffic Monitoring with a Team of Aerial Robots | - | 0
LS3: Latent Space Safe Sets for Long-Horizon Visuomotor Control of Sparse Reward Iterative Tasks | Code | 0
NVCell: Standard Cell Layout in Advanced Technology Nodes with Reinforcement Learning | - | 0
Reinforced Hybrid Genetic Algorithm for the Traveling Salesman Problem | - | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | - | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | - | Unverified