SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of positive or negative rewards for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes expected long-term reward.
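The interaction loop described above can be illustrated with a minimal sketch. It is not taken from any paper listed on this page: the `Corridor` environment, its reward values, and the hyperparameters are all hypothetical and chosen only for illustration. Tabular Q-learning stands in as one simple way to learn a policy from reward feedback.

```python
import random


class Corridor:
    """Hypothetical toy environment: states 0..n-1, start at 0, goal at n-1."""

    def __init__(self, n=5):
        self.n = n
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        # action 0 = move left, action 1 = move right (clipped at the walls)
        move = 1 if action == 1 else -1
        self.state = min(max(self.state + move, 0), self.n - 1)
        done = self.state == self.n - 1
        reward = 1.0 if done else -0.01  # small step cost, reward at the goal
        return self.state, reward, done


def discounted_return(rewards, gamma):
    """Cumulative reward G = sum over t of gamma^t * r_t for one episode."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g


def q_learning(env, episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1,
               max_steps=1000, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(env.n)]  # q[state][action]
    for _ in range(episodes):
        s = env.reset()
        for _ in range(max_steps):  # cap episode length as a safety net
            # Explore with probability epsilon (or on ties), else exploit.
            if rng.random() < epsilon or q[s][0] == q[s][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[s][0] > q[s][1] else 1
            s2, r, done = env.step(a)
            # Bootstrapped target: reward plus discounted best next value.
            target = r if done else r + gamma * max(q[s2])
            q[s][a] += alpha * (target - q[s][a])
            s = s2
            if done:
                break
    return q


def greedy_policy(q):
    """Pick the higher-valued action in each state (ties go to 'right')."""
    return [0 if left > right else 1 for left, right in q]
```

Calling `greedy_policy(q_learning(Corridor(5)))` returns one action per state. Under these assumptions the agent should learn to move right in every non-terminal state, since that is the shortest path to the rewarded goal.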

Papers

Showing 2526-2550 of 15113 papers

Title | Status | Hype
Learning human behaviors from motion capture by adversarial imitation | Code | 0
Learning to Perform Local Rewriting for Combinatorial Optimization | Code | 0
Automatic Goal Generation for Reinforcement Learning Agents | Code | 0
Kernel Density Bayesian Inverse Reinforcement Learning | Code | 0
Combining imagination and heuristics to learn strategies that generalize | Code | 0
KEHRL: Learning Knowledge-Enhanced Language Representations with Hierarchical Reinforcement Learning | Code | 0
Kernel Metric Learning for In-Sample Off-Policy Evaluation of Deterministic RL Policies | Code | 0
Just Round: Quantized Observation Spaces Enable Memory Efficient Learning of Dynamic Locomotion | Code | 0
Join Query Optimization with Deep Reinforcement Learning Algorithms | Code | 0
Jointly Learning to Construct and Control Agents using Deep Reinforcement Learning | Code | 0
Automatic Discovery of Interpretable Planning Strategies | Code | 0
Jointly Pre-training with Supervised, Autoencoder, and Value Losses for Deep Reinforcement Learning | Code | 0
Adaptive Gain Scheduling using Reinforcement Learning for Quadcopter Control | Code | 0
IxDRL: A Novel Explainable Deep Reinforcement Learning Toolkit based on Analyses of Interestingness | Code | 0
Iterative Reward Shaping using Human Feedback for Correcting Reward Misspecification | Code | 0
Aligning an optical interferometer with beam divergence control and continuous action space | Code | 0
Is Vanilla Policy Gradient Overlooked? Analyzing Deep Reinforcement Learning for Hanabi | Code | 0
Automatically Exposing Problems with Neural Dialog Models | Code | 0
Is Feedback All You Need? Leveraging Natural Language Feedback in Goal-Conditioned Reinforcement Learning | Code | 0
ACING: Actor-Critic for Instruction Learning in Black-Box Large Language Models | Code | 0
Is Value Functions Estimation with Classification Plug-and-play for Offline Reinforcement Learning? | Code | 0
IRLAS: Inverse Reinforcement Learning for Architecture Search | Code | 0
A Lightweight Calibrated Simulation Enabling Efficient Offline Learning for Optimal Control of Real Buildings | Code | 0
Iroko: A Framework to Prototype Reinforcement Learning for Data Center Traffic Control | Code | 0
Is Deep Reinforcement Learning Really Superhuman on Atari? Leveling the playing field | Code | 0
Page 102 of 605

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 |  | Unverified
2 | PPO | Mean Normalized Performance | 0.58 |  | Unverified