SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment to maximize a cumulative reward signal. The agent interacts with the environment and learns by receiving feedback in the form of rewards or punishments for its actions. The goal of reinforcement learning is to find the optimal policy or decision-making strategy that maximizes the long-term reward.
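
As a rough illustration of the "cumulative reward" idea above, the sketch below runs one episode in a made-up 1-D corridor and computes the discounted return of the rewards collected. The environment, the `rollout` and `discounted_return` helpers, the policy, and the discount factor `gamma` are all assumptions chosen for illustration and are not taken from any paper listed here.

```python
def discounted_return(rewards, gamma=0.9):
    """Return sum over t of gamma**t * rewards[t], accumulated backwards."""
    total = 0.0
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


def rollout(policy, start=0, goal=3, max_steps=10):
    """Run one episode in a hypothetical corridor of integer states.

    The agent moves by policy(state), which should be +1 or -1.
    Reaching `goal` yields reward 1.0 and ends the episode;
    every other step yields reward 0.0.
    """
    state = start
    rewards = []
    for _ in range(max_steps):
        state = max(0, state + policy(state))  # the left wall sits at state 0
        if state == goal:
            rewards.append(1.0)
            break
        rewards.append(0.0)
    return rewards


def always_right(state):
    """A fixed policy that always steps toward the goal."""
    return 1


episode_rewards = rollout(always_right)
episode_return = discounted_return(episode_rewards, gamma=0.9)
```

A policy that reaches the goal in fewer steps earns a larger discounted return, which is the quantity an RL agent would try to maximize when choosing among policies.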

Papers

Showing 5951-6000 of 15113 papers

| Title | Status | Hype |
|---|---|---|
| Understanding the Evolution of Linear Regions in Deep Reinforcement Learning | Code | 0 |
| OSS Mentor A framework for improving developers contributions via deep reinforcement learning | - | 0 |
| MEET: A Monte Carlo Exploration-Exploitation Trade-off for Buffer Sampling | Code | 0 |
| Reachability-Aware Laplacian Representation in Reinforcement Learning | - | 0 |
| Reinforcement Learning and Bandits for Speech and Language Processing: Tutorial, Review and Outlook | - | 0 |
| Opportunistic Episodic Reinforcement Learning | - | 0 |
| MetaEMS: A Meta Reinforcement Learning-based Control Framework for Building Energy Management System | - | 0 |
| Active Predictive Coding: A Unified Neural Framework for Learning Hierarchical World Models for Perception and Planning | - | 0 |
| Learning General World Models in a Handful of Reward-Free Deployments | - | 0 |
| LEAGUE: Guided Skill Learning and Abstraction for Long-Horizon Manipulation | - | 0 |
| A Cooperative Reinforcement Learning Environment for Detecting and Penalizing Betrayal | - | 0 |
| Climate Change Policy Exploration using Reinforcement Learning | - | 0 |
| Faster and more diverse de novo molecular optimization with double-loop reinforcement learning using augmented SMILES | - | 0 |
| Attitude Control of Highly Maneuverable Aircraft Using an Improved Q-learning | - | 0 |
| Probing Transfer in Deep Reinforcement Learning without Task Engineering | - | 0 |
| Epistemic Monte Carlo Tree Search | - | 0 |
| Towards Quantum-Enabled 6G Slicing | - | 0 |
| On the connection between Bregman divergence and value in regularized Markov decision processes | - | 0 |
| Rate-Splitting for Intelligent Reflecting Surface-Aided Multiuser VR Streaming | Code | 0 |
| Deep Reinforcement Learning for Stabilization of Large-scale Probabilistic Boolean Networks | - | 0 |
| Redefining Counterfactual Explanations for Reinforcement Learning: Overview, Challenges and Opportunities | - | 0 |
| Continual Vision-based Reinforcement Learning with Group Symmetries | - | 0 |
| Integrating Policy Summaries with Reward Decomposition for Explaining Reinforcement Learning Agents | - | 0 |
| Group Distributionally Robust Reinforcement Learning with Hierarchical Latent Variables | - | 0 |
| Biologically Plausible Variational Policy Gradient with Spiking Recurrent Winner-Take-All Networks | Code | 0 |
| Implicit Offline Reinforcement Learning via Supervised Learning | - | 0 |
| Deep Reinforcement Learning for Inverse Inorganic Materials Design | - | 0 |
| Horizon-Free and Variance-Dependent Reinforcement Learning for Latent Markov Decision Processes | - | 0 |
| Fine-Grained Session Recommendations in E-commerce using Deep Reinforcement Learning | - | 0 |
| Task Phasing: Automated Curriculum Learning from Demonstrations | Code | 0 |
| Model-based Lifelong Reinforcement Learning with Bayesian Exploration | Code | 0 |
| The Pump Scheduling Problem: A Real-World Scenario for Reinforcement Learning | Code | 0 |
| Safe Policy Improvement in Constrained Markov Decision Processes | - | 0 |
| Robust Imitation via Mirror Descent Inverse Reinforcement Learning | - | 0 |
| Provably Safe Reinforcement Learning via Action Projection using Reachability Analysis and Polynomial Zonotopes | - | 0 |
| Scaling Laws for Reward Model Overoptimization | - | 0 |
| Palm up: Playing in the Latent Manifold for Unsupervised Pretraining | - | 0 |
| Robust Offline Reinforcement Learning with Gradient Penalty and Constraint Relaxation | - | 0 |
| Robotic Table Wiping via Reinforcement Learning and Whole-body Trajectory Optimization | - | 0 |
| Robot Navigation with Reinforcement Learned Path Generation and Fine-Tuned Motion Control | - | 0 |
| Oracles & Followers: Stackelberg Equilibria in Deep Multi-Agent Reinforcement Learning | - | 0 |
| When to Ask for Help: Proactive Interventions in Autonomous Reinforcement Learning | Code | 0 |
| On the Power of Pre-training for Generalization in RL: Provable Benefits and Hardness | - | 0 |
| Learning Preferences for Interactive Autonomy | Code | 0 |
| Integrated Decision and Control for High-Level Automated Vehicles by Mixed Policy Gradient and Its Experiment Verification | - | 0 |
| A Reinforcement Learning Approach in Multi-Phase Second-Price Auction Design | - | 0 |
| Hierarchical Reinforcement Learning for Furniture Layout in Virtual Indoor Scenes | - | 0 |
| CLUTR: Curriculum Learning via Unsupervised Task Representation Learning | Code | 0 |
| CEIP: Combining Explicit and Implicit Priors for Reinforcement Learning with Demonstrations | Code | 0 |
| Unpacking Reward Shaping: Understanding the Benefits of Reward Engineering on Sample Complexity | - | 0 |
Page 120 of 303

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | PPG | Mean Normalized Performance | 0.76 | - | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | - | Unverified |