SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment to maximize a cumulative reward signal. The agent interacts with the environment and learns by receiving feedback in the form of rewards or punishments for its actions. The goal of reinforcement learning is to find the optimal policy or decision-making strategy that maximizes the long-term reward.
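As a rough illustration of the loop described above (act, receive a reward or penalty, update the policy), here is a minimal sketch of tabular Q-learning on a toy problem. Everything in it is an assumption made for illustration and does not come from any paper listed on this page: the five-state chain environment, the `step` function, the reward values, and the hyperparameters are all hypothetical.

```python
import random

N_STATES = 5          # hypothetical chain of states 0..4; state 4 is the goal
ACTIONS = (0, 1)      # 0 = move left, 1 = move right

def step(state, action):
    """Toy environment (an assumption): returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    if next_state == N_STATES - 1:
        return next_state, 1.0, True    # reward for reaching the goal
    return next_state, -0.01, False     # small penalty for every other step

def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Learn action values with epsilon-greedy tabular Q-learning."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(100):            # cap on episode length
            # Explore with probability epsilon, otherwise act greedily.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[state][a])
            next_state, reward, done = step(state, action)
            # Bootstrapped target: reward plus discounted best next value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q

def greedy_policy(q):
    """Best action in each non-goal state according to the learned values."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]

q_table = train()
policy = greedy_policy(q_table)
print(policy)
```

In this toy setting the learned greedy policy is expected to choose action 1 (move right) in every non-goal state, since that is the shortest route to the only positive reward. Real tasks usually need function approximation and far more careful exploration than this sketch shows.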

Papers

Showing 3576–3600 of 15113 papers

| Title | Status | Hype |
|---|---|---|
| Deep Reinforcement Learning for Autonomous Driving | Code | 0 |
| Global and Local Analysis of Interestingness for Competency-Aware Deep Reinforcement Learning | Code | 0 |
| Posterior Sampling for Reinforcement Learning Without Episodes | Code | 0 |
| GLIB: Efficient Exploration for Relational Model-Based Reinforcement Learning via Goal-Literal Babbling | Code | 0 |
| Goal-conditioned Imitation Learning | Code | 0 |
| GHQ: Grouped Hybrid Q Learning for Heterogeneous Cooperative Multi-agent Reinforcement Learning | Code | 0 |
| Deep Reinforcement Learning for Chinese Zero pronoun Resolution | Code | 0 |
| APRIL: Interactively Learning to Summarise by Combining Active Preference Learning and Reinforcement Learning | Code | 0 |
| Gifting in multi-agent reinforcement learning | Code | 0 |
| Learning robust control for LQR systems with multiplicative noise via policy gradient | Code | 0 |
| Collision Avoidance Robotics Via Meta-Learning (CARML) | Code | 0 |
| Collision Avoidance in Pedestrian-Rich Environments with Deep Reinforcement Learning | Code | 0 |
| Learning When to Think: Shaping Adaptive Reasoning in R1-Style Models via Multi-Stage RL | Code | 0 |
| GFlowNets and variational inference | Code | 0 |
| GFlowNet Training by Policy Gradients | Code | 0 |
| "Give Me an Example Like This": Episodic Active Reinforcement Learning from Demonstrations | Code | 0 |
| A Low-Cost Ethics Shaping Approach for Designing Reinforcement Learning Agents | Code | 0 |
| Goal-Conditioned Q-Learning as Knowledge Distillation | Code | 0 |
| Generative Modelling of Stochastic Actions with Arbitrary Constraints in Reinforcement Learning | Code | 0 |
| Generative Adversarial User Model for Reinforcement Learning Based Recommendation System | Code | 0 |
| A Practical Guide to Multi-Objective Reinforcement Learning and Planning | Code | 0 |
| Generative Planning for Temporally Coordinated Exploration in Reinforcement Learning | Code | 0 |
| Deep Reinforcement Learning for Cybersecurity Assessment of Wind Integrated Power Systems | Code | 0 |
| Generating Multi-type Temporal Sequences to Mitigate Class-imbalanced Problem | Code | 0 |
| Generative Adversarial Network for Abstractive Text Summarization | Code | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |