SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback, in the form of rewards or penalties, on the actions it takes. The goal is to find the optimal policy, that is, the decision-making strategy that maximizes the expected long-term reward.
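The interaction loop described above can be illustrated with a minimal sketch. It uses tabular Q-learning, one of many RL algorithms and chosen here only for brevity, on a hypothetical five-cell corridor. The environment, the function names (`step`, `choose_action`, `train`, `greedy_policy`), and all hyperparameters are assumptions made for illustration; they do not come from any paper listed on this page.

```python
import random

N_STATES = 5          # hypothetical corridor cells 0..4; cell 4 is the rewarding terminal state
ACTIONS = (0, 1)      # 0 = step left, 1 = step right


def step(state, action):
    """Toy environment dynamics: return (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy choice; ties between equal Q-values are broken at random."""
    if rng.random() < epsilon or q[state][0] == q[state][1]:
        return rng.choice(ACTIONS)
    return 0 if q[state][0] > q[state][1] else 1


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, max_steps=100, seed=0):
    """Run Q-learning and return the learned table q[state][action]."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _episode in range(episodes):
        state = 0
        for _t in range(max_steps):
            action = choose_action(q, state, epsilon, rng)
            next_state, reward, done = step(state, action)
            # Bootstrapped target: immediate reward plus the discounted best next value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Best-known action for each non-terminal state (1 = right, 0 = left)."""
    return [0 if q[s][0] > q[s][1] else 1 for s in range(N_STATES - 1)]


if __name__ == "__main__":
    q_table = train()
    print("Greedy action per state:", greedy_policy(q_table))
```

In this toy setting the only reward sits at the right end of the corridor, so the learned greedy policy should prefer stepping right in every cell. The discount factor `gamma` is what turns a single delayed reward into a long-term objective that propagates back to earlier states.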

Papers

Showing 4501–4550 of 15113 papers

Title | Status | Hype
One-shot, Offline and Production-Scalable PID Optimisation with Deep Reinforcement Learning | - | 0
Symbolic Distillation for Learned TCP Congestion Control | Code | 1
MEET: A Monte Carlo Exploration-Exploitation Trade-off for Buffer Sampling | Code | 0
OSS Mentor A framework for improving developers contributions via deep reinforcement learning | - | 0
Opportunistic Episodic Reinforcement Learning | - | 0
Understanding the Evolution of Linear Regions in Deep Reinforcement Learning | Code | 0
Reinforcement Learning and Bandits for Speech and Language Processing: Tutorial, Review and Outlook | - | 0
Energy Pricing in P2P Energy Systems Using Reinforcement Learning | Code | 1
Graph Reinforcement Learning-based CNN Inference Offloading in Dynamic Edge Computing | - | 0
Causal Explanation for Reinforcement Learning: Quantifying State and Temporal Importance | - | 0
Hardness in Markov Decision Processes: Theory and Practice | - | 0
ADLight: A Universal Approach of Traffic Signal Control with Augmented Data Using Reinforcement Learning | Code | 1
Classifying Ambiguous Identities in Hidden-Role Stochastic Games with Multi-Agent Reinforcement Learning | Code | 0
AACHER: Assorted Actor-Critic Deep Reinforcement Learning with Hindsight Experience Replay | Code | 0
Avalon: A Benchmark for RL Generalization Using Procedurally Generated Worlds | Code | 1
Dichotomy of Control: Separating What You Can Control from What You Cannot | - | 0
Evaluating Long-Term Memory in 3D Mazes | Code | 1
Multi-Agent Path Finding via Tree LSTM | Code | 1
Reachability-Aware Laplacian Representation in Reinforcement Learning | - | 0
Climate Change Policy Exploration using Reinforcement Learning | - | 0
Active Predictive Coding: A Unified Neural Framework for Learning Hierarchical World Models for Perception and Planning | - | 0
LEAGUE: Guided Skill Learning and Abstraction for Long-Horizon Manipulation | - | 0
A Cooperative Reinforcement Learning Environment for Detecting and Penalizing Betrayal | - | 0
Learning General World Models in a Handful of Reward-Free Deployments | - | 0
MetaEMS: A Meta Reinforcement Learning-based Control Framework for Building Energy Management System | - | 0
Attitude Control of Highly Maneuverable Aircraft Using an Improved Q-learning | - | 0
Faster and more diverse de novo molecular optimization with double-loop reinforcement learning using augmented SMILES | - | 0
Probing Transfer in Deep Reinforcement Learning without Task Engineering | - | 0
Towards Quantum-Enabled 6G Slicing | - | 0
Rate-Splitting for Intelligent Reflecting Surface-Aided Multiuser VR Streaming | Code | 0
Epistemic Monte Carlo Tree Search | - | 0
On the connection between Bregman divergence and value in regularized Markov decision processes | - | 0
Implicit Offline Reinforcement Learning via Supervised Learning | - | 0
Continual Vision-based Reinforcement Learning with Group Symmetries | - | 0
Biologically Plausible Variational Policy Gradient with Spiking Recurrent Winner-Take-All Networks | Code | 0
Group Distributionally Robust Reinforcement Learning with Hierarchical Latent Variables | - | 0
Deep Reinforcement Learning for Stabilization of Large-scale Probabilistic Boolean Networks | - | 0
Redefining Counterfactual Explanations for Reinforcement Learning: Overview, Challenges and Opportunities | - | 0
Deep Reinforcement Learning for Inverse Inorganic Materials Design | - | 0
Integrating Policy Summaries with Reward Decomposition for Explaining Reinforcement Learning Agents | - | 0
PaCo: Parameter-Compositional Multi-Task Reinforcement Learning | Code | 1
Fine-Grained Session Recommendations in E-commerce using Deep Reinforcement Learning | - | 0
Horizon-Free and Variance-Dependent Reinforcement Learning for Latent Markov Decision Processes | - | 0
Robust Imitation via Mirror Descent Inverse Reinforcement Learning | - | 0
Model-based Lifelong Reinforcement Learning with Bayesian Exploration | Code | 0
MoCoDA: Model-based Counterfactual Data Augmentation | Code | 1
The Pump Scheduling Problem: A Real-World Scenario for Reinforcement Learning | Code | 0
Safe Policy Improvement in Constrained Markov Decision Processes | - | 0
Task Phasing: Automated Curriculum Learning from Demonstrations | Code | 0
RMBench: Benchmarking Deep Reinforcement Learning for Robotic Manipulator Control | Code | 1
Page 91 of 303

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | - | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | - | Unverified