SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes the expected long-term reward.
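The interaction loop described above can be sketched with tabular Q-learning on a toy problem. This is only an illustrative sketch, not a method taken from any paper listed here: the five-state corridor environment, the `step`, `choose_action`, and `train` functions, and all hyperparameter values are assumptions made up for this example.

```python
import random

# Hypothetical toy environment: states 0..4 on a line, state 4 is the goal.
N_STATES = 5
ACTIONS = (0, 1)   # 0 = move left, 1 = move right
GAMMA = 0.9        # discount factor for future rewards (assumed value)
ALPHA = 0.5        # learning rate (assumed value)
EPSILON = 0.1      # exploration probability (assumed value)


def step(state, action):
    """Apply an action and return (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0   # reward only on reaching the goal
    return next_state, reward, done


def choose_action(q, state, rng):
    """Epsilon-greedy selection with random tie-breaking."""
    if rng.random() < EPSILON:
        return rng.choice(ACTIONS)
    best = max(q[state])
    return rng.choice([a for a in ACTIONS if q[state][a] == best])


def train(episodes=500, seed=0):
    """Run Q-learning and return the learned action-value table."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            action = choose_action(q, state, rng)
            next_state, reward, done = step(state, action)
            # Bootstrapped target: immediate reward plus discounted best next value.
            target = reward if done else reward + GAMMA * max(q[next_state])
            q[state][action] += ALPHA * (target - q[state][action])
            state = next_state
    return q


q_table = train()
# Greedy policy for the non-terminal states: the action with the highest value.
greedy_policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
print(greedy_policy)
```

In this sketch the policy is simply the greedy action under the learned table, and the discount factor is what makes the agent prefer reaching the goal sooner rather than later.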

Papers

Showing 14051-14100 of 15113 papers

Title | Status | Hype
Deep Reinforcement Learning for Tactile Robotics: Learning to Type on a Braille Keyboard | Code | 0
Adaptive coordination of working-memory and reinforcement learning in non-human primates performing a trial-and-error problem solving task | Code | 0
Deep Reinforcement Learning for Swarm Systems | Code | 0
Learning to Drive in a Day | Code | 0
Computational Benefits of Intermediate Rewards for Goal-Reaching Policy Learning | Code | 0
Deep Reinforcement Learning for Surgical Gesture Segmentation and Classification | Code | 0
Distinguishing Learning Rules with Brain Machine Interfaces | Code | 0
ALBA : Reinforcement Learning for Video Object Segmentation | Code | 0
Compositional Learning of Visually-Grounded Concepts Using Reinforcement | Code | 0
A Laplacian Framework for Option Discovery in Reinforcement Learning | Code | 0
Compositional Conservatism: A Transductive Approach in Offline Reinforcement Learning | Code | 0
Deep reinforcement learning for smart calibration of radio telescopes | Code | 0
Constrained Policy Optimization with Explicit Behavior Density for Offline Reinforcement Learning | Code | 0
Large Language Model-Driven Curriculum Design for Mobile Networks | Code | 0
Leveraging Approximate Model-based Shielding for Probabilistic Safety Guarantees in Continuous Environments | Code | 0
Composable Deep Reinforcement Learning for Robotic Manipulation | Code | 0
How Are Learned Perception-Based Controllers Impacted by the Limits of Robust Control? | Code | 0
Learning model-based strategies in simple environments with hierarchical q-networks | Code | 0
Automated Optical Multi-layer Design via Deep Reinforcement Learning | Code | 0
Automated Image Data Preprocessing with Deep Reinforcement Learning | Code | 0
A Kernel Loss for Solving the Bellman Equation | Code | 0
Feudal Graph Reinforcement Learning | Code | 0
Distributed Deep Reinforcement Learning: Learn how to play Atari games in 21 minutes | Code | 0
Distributed Distributional Deterministic Policy Gradients | Code | 0
Answers Unite! Unsupervised Metrics for Reinforced Summarization Models | Code | 0
Large Language Models are Autonomous Cyber Defenders | Code | 0
FeUdal Networks for Hierarchical Reinforcement Learning | Code | 0
Deep Reinforcement Learning for Sepsis Treatment | Code | 0
Large Language Models are Biased Reinforcement Learners | Code | 0
A Novel Update Mechanism for Q-Networks Based On Extreme Learning Machines | Code | 0
Automated Gadget Discovery in Science | Code | 0
Learning to Trust Bellman Updates: Selective State-Adaptive Regularization for Offline RL | Code | 0
Complex Model Transformations by Reinforcement Learning with Uncertain Human Guidance | Code | 0
How Helpful is Inverse Reinforcement Learning for Table-to-Text Generation? | Code | 0
Deep Reinforcement Learning for Programming Language Correction | Code | 0
Few-Shot Image-to-Semantics Translation for Policy Transfer in Reinforcement Learning | Code | 0
How Many Random Seeds? Statistical Power Analysis in Deep Reinforcement Learning Experiments | Code | 0
Deep Reinforcement Learning for Long-Short Portfolio Optimization | Code | 0
Distributed Reinforcement Learning for Decentralized Linear Quadratic Control: A Derivative-Free Policy Optimization Approach | Code | 0
A novel policy for pre-trained Deep Reinforcement Learning for Speech Emotion Recognition | Code | 0
A Joint Imitation-Reinforcement Learning Framework for Reduced Baseline Regret | Code | 0
Interactive Learning from Activity Description | Code | 0
How Private Is Your RL Policy? An Inverse RL Based Analysis Framework | Code | 0
Multi-Agent Reinforcement Learning in Stochastic Networked Systems | Code | 0
Deep Reinforcement Learning for Playing 2.5D Fighting Games | Code | 0
Few-shot Quality-Diversity Optimization | Code | 0
How RL Agents Behave When Their Actions Are Modified | Code | 0
Deep Reinforcement Learning for Optimal Stopping with Application in Financial Engineering | Code | 0
Distributed Soft Actor-Critic with Multivariate Reward Representation and Knowledge Distillation | Code | 0
FFNet: Video Fast-Forwarding via Reinforcement Learning | Code | 0
Page 282 of 303

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | | Unverified