SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or punishments for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes the expected long-term reward.
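
As a rough illustration of the loop described above, here is a minimal sketch using tabular Q-learning, which is only one of many RL methods. Everything in it is an assumption made for the example: the toy `ChainEnv` corridor, the reward values, the hyperparameters, and the helper names `q_learning`, `greedy_policy`, and `discounted_return` do not come from any listed paper or library.

```python
import random


class ChainEnv:
    """Hypothetical corridor: start in state 0, episode ends on reaching the last state."""

    def __init__(self, n_states=5):
        self.n_states = n_states
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        # Action 1 moves right, action 0 moves left (clamped at state 0).
        move = 1 if action == 1 else -1
        self.state = max(0, self.state + move)
        done = self.state == self.n_states - 1
        # +1 reward at the goal, a small penalty ("punishment") on every other step.
        reward = 1.0 if done else -0.01
        return self.state, reward, done


def discounted_return(rewards, gamma):
    """Cumulative reward with discount factor gamma: sum of gamma**t * r_t."""
    return sum((gamma ** t) * r for t, r in enumerate(rewards))


def q_learning(env, episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1,
               max_steps=100, seed=0):
    """Learn a table q[state][action] from interaction with env."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(env.n_states)]
    for _ in range(episodes):
        s = env.reset()
        for _ in range(max_steps):
            # Epsilon-greedy action selection; ties are broken at random.
            if rng.random() < epsilon or q[s][0] == q[s][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[s][0] > q[s][1] else 1
            s_next, r, done = env.step(a)
            # One-step temporal-difference target; no bootstrap at the terminal state.
            target = r if done else r + gamma * max(q[s_next])
            q[s][a] += alpha * (target - q[s][a])
            s = s_next
            if done:
                break
    return q


def greedy_policy(q):
    """Pick the higher-valued action in each state (right on ties)."""
    return [0 if row[0] > row[1] else 1 for row in q]
```

Calling `greedy_policy(q_learning(ChainEnv()))` returns one action per state; in this toy setup the learned policy should move right in every non-terminal state, since that is the shortest path to the reward.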

Papers

Showing 9951–9975 of 15113 papers

| Title | Status | Hype |
| --- | --- | --- |
| Reinforcement Learning-based Black-Box Evasion Attacks to Link Prediction in Dynamic Graphs | | 0 |
| Flightmare: A Flexible Quadrotor Simulator | Code | 2 |
| Data-driven Outer-Loop Control Using Deep Reinforcement Learning for Trajectory Tracking | | 0 |
| Beyond variance reduction: Understanding the true impact of baselines on policy optimization | | 0 |
| Control of a Nature-inspired Scorpion using Reinforcement Learning | | 0 |
| Efficient Reinforcement Learning in Factored MDPs with Application to Constrained RL | | 0 |
| Ranking Policy Decisions | Code | 0 |
| Deep Reinforcement Learning for Contact-Rich Skills Using Compliant Movement Primitives | | 0 |
| Human-in-the-Loop Methods for Data-Driven and Reinforcement Learning Systems | | 0 |
| How does the structure embedded in learning policy affect learning quadruped locomotion? | | 0 |
| Reinforcement Learning with Feedback-modulated TD-STDP | | 0 |
| Real-world Video Adaptation with Reinforcement Learning | | 0 |
| Sample Efficiency in Sparse Reinforcement Learning: Or Your Money Back | | 0 |
| Meta Reinforcement Learning-Based Lane Change Strategy for Autonomous Vehicles | | 0 |
| On the model-based stochastic value gradient for continuous reinforcement learning | Code | 1 |
| Query Focused Multi-document Summarisation of Biomedical Texts: Macquarie University and the Australian National University at BioASQ8b | Code | 0 |
| Market-making with reinforcement-learning (SAC) | Code | 1 |
| Controlling Level of Unconsciousness by Titrating Propofol with Deep Reinforcement Learning | | 0 |
| AutoFS: Automated Feature Selection via Diversity-aware Interactive Reinforcement Learning | Code | 0 |
| Document-editing Assistants and Model-based Reinforcement Learning as a Path to Conversational AI | | 0 |
| The Advantage Regret-Matching Actor-Critic | | 0 |
| Query Focused Multi-document Summarisation of Biomedical Texts | Code | 0 |
| Selective Particle Attention: Visual Feature-Based Attention in Deep Reinforcement Learning | | 0 |
| Synthetic Sample Selection via Reinforcement Learning | | 0 |
| Identifying Critical States by the Action-Based Variance of Expected Return | | 0 |
Page 399 of 605

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |