SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes long-term reward.
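As a rough illustration of the interaction loop described above, here is a minimal sketch using tabular Q-learning, one of many RL methods. The five-state corridor environment, the `step`, `train`, and `greedy_policy` helpers, and all hyperparameter values are assumptions made up for this example; they do not come from any paper listed on this page.

```python
import random

# Hypothetical toy environment (an assumption for illustration):
# a corridor of 5 states; the agent starts at state 0 and receives
# a reward of 1.0 when it reaches state 4, which ends the episode.
N_STATES = 5
ACTIONS = (-1, +1)  # index 0 = move left, index 1 = move right


def step(state, action):
    """Apply an action and return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Learn a table of action values by trial and error."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy: explore at random sometimes (and on ties),
            # otherwise pick the action with the higher estimated value.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, ACTIONS[a])
            # Move the estimate toward reward plus discounted future value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """Best action per non-terminal state under the learned values."""
    return [ACTIONS[0 if row[0] > row[1] else 1] for row in q[:-1]]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

In this sketch the reward is the only feedback the agent receives, and the learned policy is simply the action with the highest estimated long-term value in each state.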

Papers

Showing 8301-8350 of 15113 papers

| Title | Status | Hype |
| --- | --- | --- |
| ^2-exploration for Reinforcement Learning | | 0 |
| Superior Performance with Diversified Strategic Control in FPS Games Using General Reinforcement Learning | | 0 |
| Mitigation of Adversarial Policy Imitation via Constrained Randomization of Policy (CRoP) | | 0 |
| Polyphonic Music Composition: An Adversarial Inverse Reinforcement Learning Approach | | 0 |
| Uncertainty Regularized Policy Learning for Offline Reinforcement Learning | | 0 |
| Understanding and Leveraging Overparameterization in Recursive Value Estimation | | 0 |
| Understanding the Generalization Gap in Visual Reinforcement Learning | | 0 |
| Reasoning With Hierarchical Symbols: Reclaiming Symbolic Policies For Visual Reinforcement Learning | | 0 |
| Structure-Aware Transformer Policy for Inhomogeneous Multi-Task Reinforcement Learning | | 0 |
| Metrics Matter: A Closer Look on Self-Paced Reinforcement Learning | | 0 |
| MURO: Deployment Constrained Reinforcement Learning with Model-based Uncertainty Regularized Batch Optimization | | 0 |
| Untangling Braids with Multi-agent Q-Learning | | 0 |
| Pretraining for Language Conditioned Imitation with Transformers | | 0 |
| Online Robust Reinforcement Learning with Model Uncertainty | | 0 |
| Meta Attention For Off-Policy Actor-Critic | | 0 |
| SAFER: Data-Efficient and Safe Reinforcement Learning Through Skill Acquisition | | 0 |
| Value Refinement Network (VRN) | | 0 |
| PDQN - A Deep Reinforcement Learning Method for Planning with Long Delays: Optimization of Manufacturing Dispatching | | 0 |
| Variational oracle guiding for reinforcement learning | | 0 |
| Online Tuning for Offline Decentralized Multi-Agent Reinforcement Learning | | 0 |
| Reinforcement Learning State Estimation for High-Dimensional Nonlinear Systems | | 0 |
| Safe Exploration in Linear Equality Constraint | | 0 |
| State-Action Joint Regularized Implicit Policy for Offline Reinforcement Learning | | 0 |
| Reinforcement Learning with Predictive Consistent Representations | | 0 |
| Maximizing Ensemble Diversity in Deep Reinforcement Learning | | 0 |
| Reinforcement Learning under a Multi-agent Predictive State Representation Model: Method and Theory | | 0 |
| Stability and Generalisation in Batch Reinforcement Learning | | 0 |
| MARNET: Backdoor Attacks against Value-Decomposition Multi-Agent Reinforcement Learning | | 0 |
| WaveCorr: Deep Reinforcement Learning with Permutation Invariant Policy Networks for Portfolio Management | | 0 |
| On Reward Maximization and Distribution Matching for Fine-Tuning Language Models | | 0 |
| Weakly-Supervised Learning of Disentangled and Interpretable Skills for Hierarchical Reinforcement Learning | | 0 |
| SPP-RL: State Planning Policy Reinforcement Learning | | 0 |
| On the benefits of deep RL in accelerated MRI sampling | | 0 |
| Pareto Policy Adaptation | | 0 |
| Nested Policy Reinforcement Learning for Clinical Decision Support | | 0 |
| Pareto Policy Pool for Model-based Offline Reinforcement Learning | | 0 |
| Programmatic Reinforcement Learning without Oracles | | 0 |
| Neural Combinatorial Optimization with Reinforcement Learning : Solving theVehicle Routing Problem with Time Windows | | 0 |
| Particle Based Stochastic Policy Optimization | | 0 |
| Why so pessimistic? Estimating uncertainties for offline RL through ensembles, and why their independence matters. | | 0 |
| Provably Filtering Exogenous Distractors using Multistep Inverse Dynamics | | 0 |
| Reinforcement Learning with Ex-Post Max-Min Fairness | | 0 |
| Zero-Shot Reward Specification via Grounded Natural Language | | 0 |
| LPMARL: Linear Programming based Implicit Task Assigment for Hiearchical Multi-Agent Reinforcement Learning | | 0 |
| Making Curiosity Explicit in Vision-based RL | | 0 |
| Reinforcement Learning for Quantitative Trading | | 0 |
| A First-Occupancy Representation for Reinforcement Learning | | 0 |
| Deep Reinforcement Learning Versus Evolution Strategies: A Comparative Survey | | 0 |
| Longitudinal Deep Truck: Deep learning and deep reinforcement learning for modeling and control of longitudinal dynamics of heavy duty trucks | | 0 |
| Adaptive Informative Path Planning Using Deep Reinforcement Learning for UAV-based Active Sensing | | 0 |
Page 167 of 303

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |