SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment to maximize a cumulative reward signal. The agent interacts with the environment and learns by receiving feedback in the form of rewards or punishments for its actions. The goal of reinforcement learning is to find the optimal policy or decision-making strategy that maximizes the long-term reward.
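The loop described above (act, receive a reward, improve the policy) can be sketched with tabular Q-learning. This is a minimal illustration, not a method taken from any paper listed here. The five-state corridor environment, the function names `step`, `train`, and `greedy_policy`, and all hyperparameter values are assumptions chosen for the example.

```python
import random

# Hypothetical toy environment (an assumption for illustration):
# a corridor of 5 states. The agent starts in state 0, and reaching
# state 4 yields reward +1 and ends the episode.
N_STATES = 5
ACTIONS = (0, 1)  # 0 = step left, 1 = step right


def step(state, action):
    """Apply one action and return (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning: estimate the long-term reward of each (state, action)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon or when the estimates tie;
            # otherwise take the action with the higher estimated value.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.choice(ACTIONS)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, action)
            # Target = immediate reward + discounted best value of the next state.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Pick the highest-valued action in every non-terminal state."""
    return [0 if row[0] > row[1] else 1 for row in q[:-1]]
```

Calling `greedy_policy(train())` returns `[1, 1, 1, 1]`, meaning the learned policy steps right in every non-terminal state, which is the reward-maximizing behaviour in this toy corridor. The discount factor `gamma` is what makes the agent value the delayed reward at the end of the corridor rather than only the immediate one.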

Papers

Showing 2426-2450 of 15113 papers

| Title | Status | Hype |
| --- | --- | --- |
| Learning Explicit Credit Assignment for Cooperative Multi-Agent Reinforcement Learning via Polarization Policy Gradient | Code | 0 |
| Learning Curriculum Policies for Reinforcement Learning | Code | 0 |
| Learning Complex Teamwork Tasks Using a Given Sub-task Decomposition | Code | 0 |
| A view on learning robust goal-conditioned value functions: Interplay between RL and MPC | Code | 0 |
| Learning Conformal Abstention Policies for Adaptive Risk Management in Large Language and Vision-Language Models | Code | 0 |
| Learning data augmentation policies using augmented random search | Code | 0 |
| Learning by Playing - Solving Sparse Reward Tasks from Scratch | Code | 0 |
| Learning Approximate Stochastic Transition Models | Code | 0 |
| Learning and Policy Search in Stochastic Dynamical Systems with Bayesian Neural Networks | Code | 0 |
| A dynamical clipping approach with task feedback for Proximal Policy Optimization | Code | 0 |
| Learning and reusing primitive behaviours to improve Hindsight Experience Replay sample efficiency | Code | 0 |
| Learning-based Model Predictive Control for Safe Exploration and Reinforcement Learning | Code | 0 |
| Explicitly Encouraging Low Fractional Dimensional Trajectories Via Reinforcement Learning | Code | 0 |
| Learning Actionable Representations with Goal-Conditioned Policies | Code | 0 |
| Learning Action-Transferable Policy with Action Embedding | Code | 0 |
| Learning a model is paramount for sample efficiency in reinforcement learning control of PDEs | Code | 0 |
| Learning Bellman Complete Representations for Offline Policy Evaluation | Code | 0 |
| Latent Safety-Constrained Policy Approach for Safe Offline Reinforcement Learning | Code | 0 |
| Adaptive Power System Emergency Control using Deep Reinforcement Learning | Code | 0 |
| LatentPoison - Adversarial Attacks On The Latent Space | Code | 0 |
| Latent Guided Sampling for Combinatorial Optimization | Code | 0 |
| Latent Intention Dialogue Models | Code | 0 |
| Large Language Model-Driven Curriculum Design for Mobile Networks | Code | 0 |
| Large Language Models are Autonomous Cyber Defenders | Code | 0 |
| Language Understanding for Text-based Games Using Deep Reinforcement Learning | Code | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |