SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes the expected long-term reward.
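
The interaction loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `CorridorEnv` is a hypothetical toy environment invented for this example, and tabular Q-learning is just one of many RL algorithms, picked here because it is short. The hyperparameters (`alpha`, `gamma`, `epsilon`) are arbitrary illustrative values.

```python
import random


class CorridorEnv:
    """Hypothetical toy environment: cells 0..length-1, start in cell 0.

    Reaching the last cell ends the episode with reward 1; every other
    step gives reward 0.
    """

    def __init__(self, length=5):
        self.length = length
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        # action 0 moves left, action 1 moves right; walls clamp the move.
        move = 1 if action == 1 else -1
        self.state = min(max(self.state + move, 0), self.length - 1)
        done = self.state == self.length - 1
        reward = 1.0 if done else 0.0
        return self.state, reward, done


def discounted_return(rewards, gamma=0.9):
    """Cumulative reward of one episode, discounting later rewards by gamma."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g


def greedy_action(q_row, rng):
    """Pick the highest-valued action, breaking ties at random."""
    best = max(q_row)
    return rng.choice([a for a, v in enumerate(q_row) if v == best])


def train_q_learning(env, episodes=500, alpha=0.5, gamma=0.9,
                     epsilon=0.1, max_steps=100, seed=0):
    """Tabular Q-learning: learn action values from reward feedback."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(env.length)]
    for _ in range(episodes):
        state = env.reset()
        for _ in range(max_steps):
            # Epsilon-greedy: mostly exploit, occasionally explore.
            if rng.random() < epsilon:
                action = rng.randrange(2)
            else:
                action = greedy_action(q[state], rng)
            next_state, reward, done = env.step(action)
            # Move the estimate toward reward + discounted best next value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


if __name__ == "__main__":
    q_table = train_q_learning(CorridorEnv())
    policy = [max((0, 1), key=lambda a: q_table[s][a]) for s in range(4)]
    print("greedy action per non-terminal cell:", policy)
```

The learned policy is read off the table by taking the highest-valued action in each state. In this toy corridor, the only reward sits at the far end, so a policy that maximizes long-term reward should move right from every cell.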

Papers

Showing 4601–4625 of 15113 papers

Title | Status | Hype
Neural Operator based Reinforcement Learning for Control of first-order PDEs with Spatially-Varying State Delay | Code | 0
Which Experiences Are Influential for RL Agents? Efficiently Estimating The Influence of Experiences | Code | 0
Which Experiences Are Influential for Your Agent? Policy Iteration with Turn-over Dropout | Code | 0
Which Model to Trust: Assessing the Influence of Models on the Performance of Reinforcement Learning Algorithms for Continuous Control Tasks | Code | 0
Red Teaming with Mind Reading: White-Box Adversarial Policies Against RL Agents | Code | 0
Neural Optimizer Search with Reinforcement Learning | Code | 0
On the Perturbed States for Transformed Input-robust Reinforcement Learning | Code | 0
Predictable Reinforcement Learning Dynamics through Entropy Rate Minimization | Code | 0
Sentence Simplification with Deep Reinforcement Learning | Code | 0
Reinforcement Learning for Physical Layer Communications | Code | 0
Why People Skip Music? On Predicting Music Skips using Deep Reinforcement Learning | Code | 0
Reinforcement Learning for Pivoting Task | Code | 0
Reinforcement Learning for Portfolio Management | Code | 0
Is Policy Learning Overrated?: Width-Based Planning and Active Learning for Atari | Code | 0
Mildly Constrained Evaluation Policy for Offline Reinforcement Learning | Code | 0
Separating value functions across time-scales | Code | 0
WiNGPT-3.0 Technical Report | Code | 0
Identifying Expert Behavior in Offline Training Datasets Improves Behavioral Cloning of Robotic Manipulation Policies | Code | 0
SeqGAN: Sequence Generative Adversarial Nets with Policy Gradient | Code | 0
Sequence Adaptation via Reinforcement Learning in Recommender Systems | Code | 0
Predicting Head Movement in Panoramic Video: A Deep Reinforcement Learning Approach | Code | 0
Sequence Modeling of Temporal Credit Assignment for Episodic Reinforcement Learning | Code | 0
WOFOSTGym: A Crop Simulator for Learning Annual and Perennial Crop Management Strategies | Code | 0
Predicting optimal value functions by interpolating reward functions in scalarized multi-objective reinforcement learning | Code | 0
Reinforcement learning for Quantum Tiq-Taq-Toe | Code | 0
Page 185 of 605

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | | Unverified