SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment to maximize a cumulative reward signal. The agent interacts with the environment and learns by receiving feedback in the form of rewards or punishments for its actions. The goal of reinforcement learning is to find the optimal policy or decision-making strategy that maximizes the long-term reward.
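The interaction loop described above can be sketched with tabular Q-learning, one of many RL algorithms and chosen here only for brevity. Everything in the block is an illustrative assumption rather than something taken from the listed papers: the five-state corridor environment, the `step`, `choose_action`, `train`, and `greedy_policy` helpers, and the values of `GAMMA`, `ALPHA`, and `EPSILON`.

```python
import random

N_STATES = 5          # hypothetical corridor: states 0..4, state 4 is the goal
ACTIONS = (-1, +1)    # move left or move right
GAMMA = 0.9           # discount factor (assumed value)
ALPHA = 0.5           # learning rate (assumed value)
EPSILON = 0.2         # exploration probability (assumed value)


def step(state, action):
    """Toy environment: reward 1.0 only when the agent reaches the right end."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def choose_action(q, state, rng):
    """Epsilon-greedy choice of an action index, breaking ties at random."""
    if rng.random() < EPSILON:
        return rng.randrange(len(ACTIONS))
    best = max(q[state])
    return rng.choice([i for i, v in enumerate(q[state]) if v == best])


def train(episodes=300, max_steps=100, seed=0):
    """Learn action values from interaction with the toy environment."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            a = choose_action(q, state, rng)
            next_state, reward, done = step(state, ACTIONS[a])
            # Q-learning target: immediate reward plus discounted best future value.
            target = reward + (0.0 if done else GAMMA * max(q[next_state]))
            q[state][a] += ALPHA * (target - q[state][a])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Best action in each non-terminal state according to the learned values."""
    return [
        ACTIONS[max(range(len(ACTIONS)), key=lambda i: q[s][i])]
        for s in range(N_STATES - 1)
    ]


q_table = train()
policy = greedy_policy(q_table)
print(policy)
```

In this sketch the agent only ever sees a reward at the goal, yet the discounted update propagates that signal backward so that moving right becomes the preferred action in every state. That is the sense in which the learned policy maximizes long-term rather than immediate reward.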

Papers

Showing 2051-2075 of 15113 papers

Title | Status | Hype
Is Reinforcement Learning (Not) for Natural Language Processing: Benchmarks, Baselines, and Building Blocks for Natural Language Policy Optimization | Code | 1
Bridging Imagination and Reality for Model-Based Deep Reinforcement Learning | Code | 1
A Consciousness-Inspired Planning Agent for Model-Based Reinforcement Learning | Code | 1
Is Q-learning Provably Efficient? | Code | 1
CaiRL: A High-Performance Reinforcement Learning Environment Toolkit | Code | 1
Bridging RL Theory and Practice with the Effective Horizon | Code | 1
IRanker: Towards Ranking Foundation Model | Code | 1
B-Pref: Benchmarking Preference-Based Reinforcement Learning | Code | 1
An Application of Deep Reinforcement Learning to Algorithmic Trading | Code | 1
Is Independent Learning All You Need in the StarCraft Multi-Agent Challenge? | Code | 1
Iterative Amortized Policy Optimization | Code | 1
Invariant Policy Optimization: Towards Stronger Generalization in Reinforcement Learning | Code | 1
Boosting Soft Actor-Critic: Emphasizing Recent Experience without Forgetting the Past | Code | 1
Inverse Constrained Reinforcement Learning | Code | 1
BOME! Bilevel Optimization Made Easy: A Simple First-Order Approach | Code | 1
Intrinsic Reward Driven Imitation Learning via Generative Model | Code | 1
Interpretable End-to-end Urban Autonomous Driving with Latent Deep Reinforcement Learning | Code | 1
Blue River Controls: A toolkit for Reinforcement Learning Control Systems on Hardware | Code | 1
Optimization Methods for Interpretable Differentiable Decision Trees in Reinforcement Learning | Code | 1
Intrusion Prevention through Optimal Stopping | Code | 1
Inverse Reinforcement Learning without Reinforcement Learning | Code | 1
Interactive Machine Learning of Musical Gesture | Code | 1
Interferobot: aligning an optical interferometer by a reinforcement learning agent | Code | 1
AdaRL: What, Where, and How to Adapt in Transfer Reinforcement Learning | Code | 1
Blockchain Framework for Artificial Intelligence Computation | Code | 1
Page 83 of 605

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | | Unverified