SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes the expected long-term reward.
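The interaction loop described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the five-state corridor environment, the `step` and `train_q_learning` helpers, and the hyperparameter values are hypothetical, and tabular Q-learning is just one of many RL algorithms, chosen here because it fits in a few lines.

```python
import random

N_STATES = 5       # hypothetical corridor: states 0..4, state 4 is the goal
ACTIONS = (0, 1)   # 0 = move left, 1 = move right


def step(state, action):
    """Toy environment dynamics (an assumption): returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0  # reward only when the goal is reached
    return next_state, reward, done


def train_q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behavior policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(200):  # step cap so an episode always ends
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)  # explore
            else:
                # exploit; ties between equal Q-values are broken at random
                action = max(ACTIONS, key=lambda a: (q[state][a], rng.random()))
            next_state, reward, done = step(state, action)
            # bootstrap from the best next action unless the episode ended
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


q_table = train_q_learning()
# Greedy policy for the non-terminal states 0..3
greedy_policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
print(greedy_policy)
```

In this toy setting the learned greedy policy should choose "move right" (action 1) in every non-terminal state, since that is the only way to collect the reward. The discount factor `gamma` is what makes shorter paths to the goal more valuable than longer ones.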

Papers

Showing 801-825 of 15113 papers

Title | Status | Hype
C-COMA: A CONTINUAL REINFORCEMENT LEARNING MODEL FOR DYNAMIC MULTIAGENT ENVIRONMENTS | Code | 1
CDT: Cascading Decision Trees for Explainable Reinforcement Learning | Code | 1
Cell-Free Latent Go-Explore | Code | 1
PLASTIC: Improving Input and Label Plasticity for Sample Efficient Reinforcement Learning | Code | 1
DGPO: Discovering Multiple Strategies with Diversity-Guided Policy Optimization | Code | 1
Large Language Models are Learnable Planners for Long-Term Recommendation | Code | 1
CFR-RL: Traffic Engineering with Reinforcement Learning in SDN | Code | 1
Challenges for Reinforcement Learning in Quantum Circuit Design | Code | 1
Attention Actor-Critic algorithm for Multi-Agent Constrained Co-operative Reinforcement Learning | Code | 1
CertRL: Formalizing Convergence Proofs for Value and Policy Iteration in Coq | Code | 1
DHRL: A Graph-Based Approach for Long-Horizon and Sparse Hierarchical Reinforcement Learning | Code | 1
Attacking Cooperative Multi-Agent Reinforcement Learning by Adversarial Minority Influence | Code | 1
Entity-Centric Reinforcement Learning for Object Manipulation from Pixels | Code | 1
Attacking Video Recognition Models with Bullet-Screen Comments | Code | 1
Dialogue for Prompting: a Policy-Gradient-Based Discrete Prompt Generation for Few-shot Learning | Code | 1
Challenges of Real-World Reinforcement Learning | Code | 1
Character Controllers Using Motion VAEs | Code | 1
Design Process is a Reinforcement Learning Problem | Code | 1
A Traffic Light Dynamic Control Algorithm with Deep Reinforcement Learning Based on GNN Prediction | Code | 1
Chip Placement with Diffusion Models | Code | 1
Choices, Risks, and Reward Reports: Charting Public Policy for Reinforcement Learning Systems | Code | 1
CIC: Contrastive Intrinsic Control for Unsupervised Skill Discovery | Code | 1
Evaluating Long-Term Memory in 3D Mazes | Code | 1
Evaluating Soccer Player: from Live Camera to Deep Reinforcement Learning | Code | 1
Active Inference for Stochastic Control | Code | 1
Page 33 of 605

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | | Unverified