SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes expected long-term reward.
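
The agent-environment loop described above can be illustrated with a minimal sketch. It assumes a hypothetical five-cell corridor task and uses tabular Q-learning as one example of a learning rule; the environment, the function names (`step`, `train`, `greedy_policy`), and the hyperparameter values are illustrative assumptions and do not come from this page or from any listed paper.

```python
import random

N_STATES = 5      # hypothetical corridor with cells 0..4; cell 4 is the goal
ACTIONS = (0, 1)  # 0 = step left, 1 = step right


def step(state, action):
    """Toy environment: return (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration (illustrative values)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)  # explore
            else:
                # exploit, breaking ties at random so early episodes still move
                best = max(q[state])
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # bootstrap from the next state unless the episode has ended
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Pick the highest-valued action in every non-goal cell."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]
```

In this toy setting the reward arrives only at the goal, so the learned values encode the discounted long-term reward of each action, and the greedy policy derived from them steps right in every non-goal cell.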

Papers

Showing 3026–3050 of 15113 papers

| Title | Status | Hype |
|---|---|---|
| Harnessing Reinforcement Learning for Neural Motion Planning | Code | 0 |
| Hyperbolic Discounting and Learning over Multiple Horizons | Code | 0 |
| Learning from Ambiguous Demonstrations with Self-Explanation Guided Reinforcement Learning | Code | 0 |
| Dealing with uncertainty: balancing exploration and exploitation in deep recurrent reinforcement learning | Code | 0 |
| Improving Post-Processing of Audio Event Detectors Using Reinforcement Learning | Code | 0 |
| Learning Principle of Least Action with Reinforcement Learning | Code | 0 |
| Learning from Multiple Independent Advisors in Multi-agent Reinforcement Learning | Code | 0 |
| Correcting Momentum in Temporal Difference Learning | Code | 0 |
| Correct Me If You Can: Learning from Error Corrections and Markings | Code | 0 |
| Human level control through deep reinforcement learning | Code | 0 |
| A Generative User Simulator with GPT-based Architecture and Goal State Tracking for Reinforced Multi-Domain Dialog Systems | Code | 0 |
| Human-Inspired Framework to Accelerate Reinforcement Learning | Code | 0 |
| Learning Goal Embeddings via Self-Play for Hierarchical Reinforcement Learning | Code | 0 |
| Human-guided Robot Behavior Learning: A GAN-assisted Preference-based Reinforcement Learning Approach | Code | 0 |
| HRL4IN: Hierarchical Reinforcement Learning for Interactive Navigation with Mobile Manipulators | Code | 0 |
| Corruption-Robust Offline Reinforcement Learning with General Function Approximation | Code | 0 |
| How to Sense the World: Leveraging Hierarchy in Multimodal Perception for Robust Reinforcement Learning Agents | Code | 0 |
| HTMRL: Biologically Plausible Reinforcement Learning with Hierarchical Temporal Memory | Code | 0 |
| Co-Speech Gesture Synthesis by Reinforcement Learning With Contrastive Pre-Trained Rewards | Code | 0 |
| A State-Distribution Matching Approach to Non-Episodic Reinforcement Learning | Code | 0 |
| Cost Effective MLaaS Federation: A Combinatorial Reinforcement Learning Approach | Code | 0 |
| Reward Shaping for Human Learning via Inverse Reinforcement Learning | Code | 0 |
| Human-Level Control without Server-Grade Hardware | Code | 0 |
| Constructing Non-Markovian Decision Process via History Aggregator | Code | 0 |
| How to Build User Simulators to Train RL-based Dialog Systems | Code | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |