SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment to maximize a cumulative reward signal. The agent interacts with the environment and learns by receiving feedback in the form of rewards or punishments for its actions. The goal of reinforcement learning is to find the optimal policy or decision-making strategy that maximizes the long-term reward.
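The loop described above (act, receive a reward, improve the policy) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not taken from any paper listed here: the five-state corridor environment, the `step`, `train`, `greedy_policy`, and `discounted_return` helpers, and the hyperparameter values. Tabular Q-learning is used only as one simple way to learn a policy.

```python
import random

N_STATES = 5      # hypothetical corridor: states 0..4, state 4 is the goal
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Toy environment (an assumption): returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def discounted_return(rewards, gamma):
    """Cumulative reward of one episode, discounted by gamma per step."""
    total = 0.0
    for r in reversed(rewards):
        total = r + gamma * total
    return total


def train(episodes=200, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon, or when both actions tie.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.choice(ACTIONS)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, action)
            # Feedback from the environment moves the value estimate.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Best-known action for each non-terminal state."""
    return [0 if q[s][0] > q[s][1] else 1 for s in range(N_STATES - 1)]


if __name__ == "__main__":
    print(greedy_policy(train()))
    print(discounted_return([0.0, 0.0, 0.0, 1.0], gamma=0.9))
```

In this toy setting the only reward sits at the right end of the corridor, so the learned greedy policy should prefer moving right in every state. Real environments and the methods in the papers below are considerably more involved.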

Papers

Showing 13251–13300 of 15113 papers

Title | Status | Hype
Reinforcement Learning and Adaptive Sampling for Optimized DNN Compilation | Code | 0
Modeling question asking using neural program generation | Code | 0
Robust Reinforcement Learning via Adversarial training with Langevin Dynamics | Code | 0
Off-Dynamics Reinforcement Learning: Training for Transfer with Domain Classifiers | Code | 0
Obstacle Avoidance and Navigation Utilizing Reinforcement Learning with Reward Shaping | Code | 0
Observe-R1: Unlocking Reasoning Abilities of MLLMs with Dynamic Progressive Reinforcement Learning | Code | 0
Robust Reinforcement Learning with Dynamic Distortion Risk Measures | Code | 0
Robust Representation Learning by Clustering with Bisimulation Metrics for Visual Reinforcement Learning with Distractions | Code | 0
Phrase-Level Action Reinforcement Learning for Neural Dialog Response Generation | Code | 0
Transfer Learning for Automated Test Case Prioritization Using XCSF | Code | 0
Reinforcement Learning Agents in Colonel Blotto | Code | 0
Objective-Reinforced Generative Adversarial Networks (ORGAN) for Sequence Generation Models | Code | 0
Personalized Multimorbidity Management for Patients with Type 2 Diabetes Using Reinforcement Learning of Electronic Health Records | Code | 0
Personalized Exercise Recommendation with Semantically-Grounded Knowledge Tracing | Code | 0
Periodic Intra-Ensemble Knowledge Distillation for Reinforcement Learning | Code | 0
Unsupervised Predictive Memory in a Goal-Directed Agent | Code | 0
Reinforcement Learning | Code | 0
Reinforcement Knowledge Graph Reasoning for Explainable Recommendation | Code | 0
The State of Sparse Training in Deep Reinforcement Learning | Code | 0
ROER: Regularized Optimal Experience Replay | Code | 0
Rogue-Gym: A New Challenge for Generalization in Reinforcement Learning | Code | 0
Reinforcement and Imitation Learning for Diverse Visuomotor Skills | Code | 0
MM-KTD: Multiple Model Kalman Temporal Differences for Reinforcement Learning | Code | 0
ROLeR: Effective Reward Shaping in Offline Reinforcement Learning for Recommender Systems | Code | 0
Reinforced Mnemonic Reader for Machine Reading Comprehension | Code | 0
Transfer Learning for Prosthetics Using Imitation Learning | Code | 0
Novelty Search for Deep Reinforcement Learning Policy Network Weights by Action Sequence Edit Metric Distance | Code | 0
KRLS: Improving End-to-End Response Generation in Task Oriented Dialog with Reinforced Keywords Learning | Code | 0
Performing Deep Recurrent Double Q-Learning for Atari Games | Code | 0
Reinforced Cross-modal Alignment for Radiology Report Generation | Code | 0
ROS2Learn: a reinforcement learning framework for ROS 2 | Code | 0
Reinforced Continual Learning | Code | 0
Rotation, Translation, and Cropping for Zero-Shot Generalization | Code | 0
Novel Policy Seeking with Constrained Optimization | Code | 0
ReinBo: Machine Learning pipeline search and configuration with Bayesian Optimization embedded Reinforcement Learning | Code | 0
Transfer Learning for Related Reinforcement Learning Tasks via Image-to-Image Translation | Code | 0
Performative Reinforcement Learning in Gradually Shifting Environments | Code | 0
Sim-Anchored Learning for On-the-Fly Adaptation | Code | 0
Learning to Score Behaviors for Guided Policy Optimization | Code | 0
Regularizing Neural Networks for Future Trajectory Prediction via Inverse Reinforcement Learning Framework | Code | 0
XCSF for Automatic Test Case Prioritization | Code | 0
RUDDER: Return Decomposition for Delayed Rewards | Code | 0
Rule Augmented Unsupervised Constituency Parsing | Code | 0
Transfer of Deep Reactive Policies for MDP Planning | Code | 0
Stable Policy Optimization via Off-Policy Divergence Regularization | Code | 0
Regularizing Neural Networks by Penalizing Confident Output Distributions | Code | 0
Perceiving the World: Question-guided Reinforcement Learning for Text-based Games | Code | 0
Unsupervised Reinforcement Adaptation for Class-Imbalanced Text Classification | Code | 0
Regularizing a Model-based Policy Stationary Distribution to Stabilize Offline Reinforcement Learning | Code | 0
Margin Trader: A Reinforcement Learning Framework for Portfolio Management with Margin and Constraints | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | - | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | - | Unverified