SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes the long-term reward.
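The interaction loop described above can be illustrated with a minimal sketch. It assumes a hypothetical five-state chain environment (the agent starts at state 0 and receives a reward of 1.0 on reaching state 4) and uses tabular Q-learning, which is only one of many RL algorithms and is not named on this page. The names `step`, `train`, and `greedy_policy`, and all hyperparameter values, are illustrative assumptions.

```python
import random

N_STATES = 5        # hypothetical toy task: states 0..4, state 4 is terminal
ACTIONS = (-1, +1)  # move left or move right
GAMMA = 0.9         # discount factor applied to future rewards


def step(state, action):
    """Toy environment: reward 1.0 on reaching the last state, otherwise 0.0."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def train(episodes=500, alpha=0.5, epsilon=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration (illustrative only)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action_index]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon, or when both estimates tie.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, ACTIONS[a])
            # Bootstrapped target: immediate reward plus discounted best next value.
            target = reward + (0.0 if done else GAMMA * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """Best action per non-terminal state under the learned value estimates."""
    return [ACTIONS[0] if row[0] > row[1] else ACTIONS[1] for row in q[:-1]]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

In this toy setting the learned policy moves right in every non-terminal state, since that is the shortest route to the only reward. Real benchmarks replace the chain with far larger environments and the table with a function approximator.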

Papers

Showing 2476-2500 of 15113 papers

Title | Status | Hype
Latent Intention Dialogue Models | Code | 0
A Machine with Short-Term, Episodic, and Semantic Memory Systems | Code | 0
Latent Guided Sampling for Combinatorial Optimization | Code | 0
LatentPoison - Adversarial Attacks On The Latent Space | Code | 0
Large Language Models are Autonomous Cyber Defenders | Code | 0
Large Language Models are Biased Reinforcement Learners | Code | 0
Autonomous Option Invention for Continual Hierarchical Reinforcement Learning and Planning | Code | 0
Autonomous Navigation via Deep Reinforcement Learning for Resource Constraint Edge Nodes using Transfer Learning | Code | 0
Adaptive Natural Language Generation for Task-oriented Dialogue via Reinforcement Learning | Code | 0
Learning a model is paramount for sample efficiency in reinforcement learning control of PDEs | Code | 0
Language Understanding for Text-based Games Using Deep Reinforcement Learning | Code | 0
Autonomous Management of Energy-Harvesting IoT Nodes Using Deep Reinforcement Learning | Code | 0
Language Model Alignment with Elastic Reset | Code | 0
Language as an Abstraction for Hierarchical Deep Reinforcement Learning | Code | 0
Skynet: A Top Deep RL Agent in the Inaugural Pommerman Team Competition | Code | 0
Langevin DQN | Code | 0
A Lyapunov-based Approach to Safe Reinforcement Learning | Code | 0
LaGR-SEQ: Language-Guided Reinforcement Learning with Sample-Efficient Querying | Code | 0
Laboratory Experiments of Model-based Reinforcement Learning for Adaptive Optics Control | Code | 0
L2SR: Learning to Sample and Reconstruct for Accelerated MRI via Reinforcement Learning | Code | 0
Koopman Spectrum Nonlinear Regulators and Efficient Online Learning | Code | 0
L2Explorer: A Lifelong Reinforcement Learning Assessment Environment | Code | 0
Large Language Model-Driven Curriculum Design for Mobile Networks | Code | 0
Learning and Policy Search in Stochastic Dynamical Systems with Bayesian Neural Networks | Code | 0
Autonomous Braking System via Deep Reinforcement Learning | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | - | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | - | Unverified