SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment to maximize a cumulative reward signal. The agent interacts with the environment and learns by receiving feedback in the form of rewards or punishments for its actions. The goal of reinforcement learning is to find the optimal policy or decision-making strategy that maximizes the long-term reward.
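The agent-environment loop described above can be sketched in a few lines. This is a minimal illustration, not taken from any paper listed on this page: it assumes a hypothetical five-state corridor (`ChainEnv`) that pays a reward of 1.0 on reaching the last state, and it uses tabular Q-learning with epsilon-greedy exploration as one possible learning rule. The names `ChainEnv`, `train_q_learning`, and `greedy_policy`, and all hyperparameter values, are assumptions chosen for the example.

```python
import random


class ChainEnv:
    """Hypothetical corridor: start in state 0, reward 1.0 on reaching the last state."""

    def __init__(self, n_states=5):
        self.n_states = n_states
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        # Action 0 moves left, action 1 moves right; walls clamp the position.
        move = 1 if action == 1 else -1
        self.state = min(max(self.state + move, 0), self.n_states - 1)
        done = self.state == self.n_states - 1
        reward = 1.0 if done else 0.0
        return self.state, reward, done


def train_q_learning(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Learn a table of action values Q[state][action] from interaction."""
    rng = random.Random(seed)
    env = ChainEnv()
    q = [[0.0, 0.0] for _ in range(env.n_states)]
    for _ in range(episodes):
        state = env.reset()
        done = False
        while not done:
            # Explore with probability epsilon, or when both actions look equal.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = env.step(action)
            # Bootstrapped target: immediate reward plus discounted future value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Pick the highest-valued action in every state."""
    return [max(range(2), key=lambda a: row[a]) for row in q]


if __name__ == "__main__":
    q_table = train_q_learning()
    print(greedy_policy(q_table))
```

The discount factor `gamma` is what makes the agent value long-term reward: a step to the right earns nothing immediately in most states, but its value rises as the reward at the end of the corridor propagates backward through the table.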

Papers

Showing 9701-9750 of 15113 papers

Title | Status | Hype
Partial Off-Policy Learning: Balance Accuracy and Diversity for Human-Oriented Image Captioning | | 0
Self-Supervised Continuous Control without Policy Gradient | | 0
Robust Multi-Agent Reinforcement Learning Driven by Correlated Equilibrium | | 0
Re-examining Routing Networks for Multi-task Learning | | 0
Scalable Bayesian Inverse Reinforcement Learning by Auto-Encoding Reward | | 0
PAC-Bayesian Randomized Value Function with Informative Prior | | 0
MQES: Max-Q Entropy Search for Efficient Exploration in Continuous Reinforcement Learning | | 0
On Trade-offs of Image Prediction in Visual Model-Based Reinforcement Learning | | 0
Weighted Bellman Backups for Improved Signal-to-Noise in Q-Updates | | 0
Structure and randomness in planning and reinforcement learning | Code | 0
Understanding and Leveraging Causal Relations in Deep Reinforcement Learning | | 0
Visual Imitation with Reinforcement Learning using Recurrent Siamese Networks | | 0
RECONNAISSANCE FOR REINFORCEMENT LEARNING WITH SAFETY CONSTRAINTS | | 0
When Is Generalizable Reinforcement Learning Tractable? | | 0
R-LAtte: Attention Module for Visual Control via Reinforcement Learning | | 0
What are the Statistical Limits of Batch RL with Linear Function Approximation? | | 0
Regioned Episodic Reinforcement Learning | | 0
Fine-Tuning Offline Reinforcement Learning with Model-Based Policy Optimization | | 0
Aspect-based Sentiment Classification via Reinforcement Learning | | 0
A Simple Sparse Denoising Layer for Robust Deep Learning | | 0
Learning Latent Landmarks for Generalizable Planning | | 0
Coordinated Multi-Agent Exploration Using Shared Goals | | 0
FSV: Learning to Factorize Soft Value Function for Cooperative Multi-Agent Reinforcement Learning | | 0
Cross-State Self-Constraint for Feature Generalization in Deep Reinforcement Learning | | 0
Learning Predictive Communication by Imagination in Networked System Control | | 0
Learning from Demonstrations with Energy based Generative Adversarial Imitation Learning | | 0
Incremental Policy Gradients for Online Reinforcement Learning Control | | 0
Bounded Myopic Adversaries for Deep Reinforcement Learning Agents | | 0
Learning Safe Policies with Cost-sensitive Advantage Estimation | | 0
An Examination of Preference-based Reinforcement Learning for Treatment Recommendation | | 0
A Robust Fuel Optimization Strategy For Hybrid Electric Vehicles: A Deep Reinforcement Learning Based Continuous Time Design Approach | | 0
BRAC+: Going Deeper with Behavior Regularized Offline Reinforcement Learning | | 0
Deep Reinforcement Learning With Adaptive Combined Critics | | 0
Hindsight Curriculum Generation Based Multi-Goal Experience Replay | | 0
Distributional Reinforcement Learning for Risk-Sensitive Policies | | 0
Learning Efficient Planning-based Rewards for Imitation Learning | | 0
Genetic Soft Updates for Policy Evolution in Deep Reinforcement Learning | | 0
Entropic Risk-Sensitive Reinforcement Learning: A Meta Regret Framework with Function Approximation | | 0
Discrete Predictive Representation for Long-horizon Planning | | 0
Benchmarking Multi-Agent Deep Reinforcement Learning Algorithms | | 0
A REINFORCEMENT LEARNING FRAMEWORK FOR TIME DEPENDENT CAUSAL EFFECTS EVALUATION IN A/B TESTING | | 0
Learning a Transferable Scheduling Policy for Various Vehicle Routing Problems based on Graph-centric Representation Learning | | 0
Learning to communicate through imagination with model-based deep multi-agent reinforcement learning | | 0
Average Reward Reinforcement Learning with Monotonic Policy Improvement | | 0
Error Controlled Actor-Critic Method to Reinforcement Learning | | 0
Learning to Dynamically Select Between Reward Shaping Signals | | 0
Daylight: Assessing Generalization Skills of Deep Reinforcement Learning Agents | | 0
Learning to Explore with Pleasure | | 0
Learning Active Learning in the Batch-Mode Setup with Ensembles of Active Learning Agents | | 0
Learning to Observe with Reinforcement Learning | | 0
Page 195 of 303

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | | Unverified