SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback on its actions, given as rewards or penalties. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes the expected long-term reward.
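As a rough illustration of the agent-environment loop described above, here is a minimal sketch of tabular Q-learning, one common RL algorithm among many. The five-state chain environment, the function names (`step`, `discounted_return`, `q_learning`, `greedy_policy`), and all hyperparameter values are assumptions made for this example, not taken from any paper listed on this page.

```python
import random

# Hypothetical toy environment: a chain of states 0..4, with state 4 terminal.
N_STATES = 5
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Return (next_state, reward, done) for the toy chain."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    # Reward 1 only on reaching the terminal state, 0 otherwise.
    return next_state, (1.0 if done else 0.0), done


def discounted_return(rewards, gamma):
    """Cumulative discounted reward, accumulated from the last step backwards."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g


def q_learning(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Learn action values from interaction, using a uniformly random behavior policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            action = rng.choice(ACTIONS)  # explore uniformly at random
            next_state, reward, done = step(state, action)
            # Bootstrapped target: immediate reward plus discounted best next value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Pick the highest-valued action in each non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]
```

Under these assumptions, `greedy_policy(q_learning())` should choose "move right" in every non-terminal state, since that is the shortest path to the only reward. The behavior policy here is purely random, which works because Q-learning is off-policy; practical agents usually balance exploration and exploitation instead.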

Papers

Showing 4376–4400 of 15113 papers

Title | Status | Hype
Reinforcement Learning based Collective Entity Alignment with Adaptive Features | Code | 0
Playing FPS Games with Deep Reinforcement Learning | Code | 0
Universal Value Density Estimation for Imitation Learning and Goal-Conditioned Reinforcement Learning | Code | 0
Playing Games in the Dark: An approach for cross-modality transfer in reinforcement learning | Code | 0
Near Optimal Behavior via Approximate State Abstraction | Code | 0
Playing Text-Adventure Games with Graph-Based Deep Reinforcement Learning | Code | 0
Sample Complexity of Robust Reinforcement Learning with a Generative Model | Code | 0
Near-optimal Deep Reinforcement Learning Policies from Data for Zone Temperature Control | Code | 0
Unpaired Sentiment-to-Sentiment Translation: A Cycled Reinforcement Learning Approach | Code | 0
PlotMap: Automated Layout Design for Building Game Worlds | Code | 0
Contrastive Multi-document Question Generation | Code | 0
Sample-Efficient Deep Reinforcement Learning via Episodic Backward Update | Code | 0
Meta-Reinforcement Learning in Broad and Non-Parametric Environments | Code | 0
POCE: Primal Policy Optimization with Conservative Estimation for Multi-constraint Offline Reinforcement Learning | Code | 0
Unsupervised Learning for Robust Fitting: A Reinforcement Learning Approach | Code | 0
Sample-Efficient Model-Free Reinforcement Learning with Off-Policy Critics | Code | 0
Sample Efficient Model-free Reinforcement Learning from LTL Specifications with Optimality Guarantees | Code | 0
Unsupervised multi-latent space reinforcement learning framework for video summarization in ultrasound imaging | Code | 0
Reinforcement Learning Based Graph-to-Sequence Model for Natural Question Generation | Code | 0
Unsupervised Predictive Memory in a Goal-Directed Agent | Code | 0
Sample Efficient Policy Gradient Methods with Recursive Variance Reduction | Code | 0
MCTS-GEB: Monte Carlo Tree Search is a Good E-graph Builder | Code | 0
Reinforcement Learning-based Heuristics to Guide Domain-Independent Dynamic Programming | Code | 0
Unsupervised Reinforcement Adaptation for Class-Imbalanced Text Classification | Code | 0
Unsupervised Reinforcement Learning in Multiple Environments | Code | 0
Page 176 of 605

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | - | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | - | Unverified