SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from the feedback it receives, in the form of rewards or penalties, for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes the expected long-term reward.
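The agent-environment loop described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and does not come from any paper listed on this page: a hypothetical five-state chain environment, a uniformly random behavior policy, and tabular Q-learning as one possible learning rule among many.

```python
import random

# Hypothetical toy environment (an assumption for illustration):
# states 0..4 on a line, the agent starts at 0, and state 4 is the goal.
N_STATES = 5
ACTIONS = (0, 1)   # 0 = move left, 1 = move right
GAMMA = 0.9        # discount factor for long-term reward
ALPHA = 0.5        # learning rate


def step(state, action):
    """Apply an action and return (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0   # reward only on reaching the goal
    return next_state, reward, done


def train(episodes=500, seed=0):
    """Tabular Q-learning with a uniformly random behavior policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            action = rng.choice(ACTIONS)             # explore at random
            next_state, reward, done = step(state, action)
            # Bootstrapped target: immediate reward plus discounted best next value.
            target = reward if done else reward + GAMMA * max(q[next_state])
            q[state][action] += ALPHA * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Pick the highest-valued action in each non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


q_table = train()
policy = greedy_policy(q_table)
print(policy)
```

Because Q-learning is off-policy, the agent can act randomly while still estimating the value of acting greedily. In this toy chain the greedy policy read from the learned table should move right in every state, which is the reward-maximizing behavior here.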

Papers

Showing 1001-1025 of 15113 papers

Title | Status | Hype
Deep Active Inference for Partially Observable MDPs | Code | 1
Inverse Reinforcement Learning without Reinforcement Learning | Code | 1
IRanker: Towards Ranking Foundation Model | Code | 1
Is Independent Learning All You Need in the StarCraft Multi-Agent Challenge? | Code | 1
Is Reinforcement Learning (Not) for Natural Language Processing: Benchmarks, Baselines, and Building Blocks for Natural Language Policy Optimization | Code | 1
DyNODE: Neural Ordinary Differential Equations for Dynamics Modeling in Continuous Control | Code | 1
Efficient Model-Based Reinforcement Learning through Optimistic Policy Search and Planning | Code | 1
A Boolean Task Algebra for Reinforcement Learning | Code | 1
Deep Laplacian-based Options for Temporally-Extended Exploration | Code | 1
Jigsaw-R1: A Study of Rule-based Visual Reinforcement Learning with Jigsaw Puzzles | Code | 1
Enhancement of a state-of-the-art RL-based detection algorithm for Massive MIMO radars | Code | 1
DeepFreight: Integrating Deep Reinforcement Learning and Mixed Integer Programming for Multi-transfer Truck Freight Delivery | Code | 1
Expression might be enough: representing pressure and demand for reinforcement learning based traffic signal control | Code | 1
Deep Intrinsically Motivated Exploration in Continuous Control | Code | 1
Automatic Data Augmentation for Generalization in Deep Reinforcement Learning | Code | 1
Deep Latent Competition: Learning to Race Using Visual Control Policies in Latent Space | Code | 1
Adaptive Transformers in RL | Code | 1
Analytical Lyapunov Function Discovery: An RL-based Generative Approach | Code | 1
DeepMimic: Example-Guided Deep Reinforcement Learning of Physics-Based Character Skills | Code | 1
Advantage-Weighted Regression: Simple and Scalable Off-Policy Reinforcement Learning | Code | 1
Analytic Manifold Learning: Unifying and Evaluating Representations for Continuous Control | Code | 1
DeepMind Control Suite | Code | 1
Automatic Data Augmentation for Generalization in Reinforcement Learning | Code | 1
Drama: Mamba-Enabled Model-Based Reinforcement Learning Is Sample and Parameter Efficient | Code | 1
Automatic Curriculum Learning through Value Disagreement | Code | 1
Page 41 of 605

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | | Unverified