SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes the expected long-term reward.
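As a rough illustration of the agent-environment loop described above, here is a minimal sketch of tabular Q-learning, one common RL algorithm chosen here only as an example. The five-state chain environment, the `step`, `train`, and `greedy_policy` helpers, and all hyperparameter values are hypothetical and not taken from any paper listed on this page.

```python
import random

N_STATES = 5      # hypothetical toy chain: states 0..4, state 4 is terminal
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Toy environment transition: returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0  # reward only when the goal is reached
    return next_state, reward, done


def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Learn a table of action values Q[state][action] by trial and error."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                # explore: pick a random action
                action = rng.choice(ACTIONS)
            else:
                # exploit: pick a best-valued action, breaking ties randomly
                best = max(q[state])
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # one-step target: reward plus discounted best future value
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Best action in each non-terminal state according to the learned table."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


if __name__ == "__main__":
    q_table = train()
    print(greedy_policy(q_table))
```

In this toy setting the learned greedy policy should move right in every non-terminal state, since that is the only way to reach the rewarded goal. The discount factor `gamma` controls how strongly long-term reward is weighted against immediate reward.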

Papers

Showing 13801–13850 of 15113 papers

| Title | Status | Hype |
|---|---|---|
| Dyna Planning using a Feature Based Generative Model | | 0 |
| Deep Reinforcement Learning of Marked Temporal Point Processes | Code | 0 |
| Discovering Blind Spots in Reinforcement Learning | | 0 |
| Reinforcement Learning for Heterogeneous Teams with PALO Bounds | | 0 |
| Scalable Coordinated Exploration in Concurrent Reinforcement Learning | Code | 0 |
| When Simple Exploration is Sample Efficient: Identifying Sufficient Conditions for Random Exploration to Yield PAC RL Algorithms | | 0 |
| Multi-task Maximum Entropy Inverse Reinforcement Learning | Code | 0 |
| Scalable Centralized Deep Multi-Agent Reinforcement Learning via Policy Gradients | | 0 |
| Guided Feature Transformation (GFT): A Neural Language Grounding Module for Embodied Agents | Code | 0 |
| Data-Efficient Hierarchical Reinforcement Learning | Code | 0 |
| A General Family of Robust Stochastic Operators for Reinforcement Learning | | 0 |
| Evolution-Guided Policy Gradient in Reinforcement Learning | Code | 0 |
| Learning Safe Policies with Expert Guidance | | 0 |
| A Framework and Method for Online Inverse Reinforcement Learning | | 0 |
| Hierarchically Structured Reinforcement Learning for Topically Coherent Visual Story Generation | | 0 |
| Hierarchical Reinforcement Learning with Hindsight | | 0 |
| Multiple-Step Greedy Policies in Online and Approximate Reinforcement Learning | | 0 |
| Where Do You Think You're Going?: Inferring Beliefs about Dynamics from Behavior | Code | 0 |
| Unsupervised Video Object Segmentation for Deep Reinforcement Learning | Code | 0 |
| Machine Teaching for Inverse Reinforcement Learning: Algorithms and Applications | Code | 0 |
| Constrained Policy Improvement for Safe and Efficient Reinforcement Learning | Code | 0 |
| Learning to Teach in Cooperative Multiagent Reinforcement Learning | | 0 |
| A Lyapunov-based Approach to Safe Reinforcement Learning | Code | 0 |
| Learning Real-World Robot Policies by Dreaming | | 0 |
| Episodic Memory Deep Q-Networks | | 0 |
| Reinforcement Learning of Theorem Proving | | 0 |
| Solving the Rubik's Cube Without Human Knowledge | Code | 0 |
| Two geometric input transformation methods for fast online reinforcement learning with neural nets | | 0 |
| Improving Image Captioning with Conditional Generative Adversarial Nets | Code | 0 |
| Hierarchical Reinforcement Learning with Deep Nested Agents | | 0 |
| Evolutionary RL for Container Loading | | 0 |
| Learning Time-Sensitive Strategies in Space Fortress | Code | 0 |
| Language Expansion In Text-Based Games | | 0 |
| Deep Reinforcement Learning for Resource Management in Network Slicing | | 0 |
| Fast Retinomorphic Event Stream for Video Recognition and Reinforcement Learning | | 0 |
| FollowNet: Robot Navigation by Following Natural Language Directions with Deep Reinforcement Learning | | 0 |
| Optimized Computation Offloading Performance in Virtual Edge Computing Systems via Deep Reinforcement Learning | | 0 |
| The Hierarchical Adaptive Forgetting Variational Filter | | 0 |
| Leveraging human knowledge in tabular reinforcement learning: A study of human subjects | | 0 |
| Do deep reinforcement learning agents model intentions? | Code | 0 |
| Graph Signal Sampling via Reinforcement Learning | | 0 |
| Feedback-Based Tree Search for Reinforcement Learning | | 0 |
| Unpaired Sentiment-to-Sentiment Translation: A Cycled Reinforcement Learning Approach | Code | 0 |
| GAN Q-learning | Code | 0 |
| Low-pass Recurrent Neural Networks - A memory architecture for longer-term correlation discovery | | 0 |
| Generating Rescheduling Knowledge using Reinforcement Learning in a Cognitive Architecture | | 0 |
| Towards Autonomous Reinforcement Learning: Automatic Setting of Hyper-parameters using Bayesian Optimization | | 0 |
| Deep Hierarchical Reinforcement Learning Algorithm in Partially Observable Markov Decision Processes | | 0 |
| Interactive Reinforcement Learning with Dynamic Reuse of Prior Knowledge from Human/Agent's Demonstration | | 0 |
| Leveraging Grammar and Reinforcement Learning for Neural Program Synthesis | | 0 |
Page 277 of 303

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |