SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal of reinforcement learning is to find an optimal policy, that is, a decision-making strategy that maximizes the expected long-term reward.
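
The interaction loop described above can be illustrated with a minimal sketch. It assumes a hypothetical five-cell corridor environment (`step`) and uses tabular Q-learning with epsilon-greedy exploration as one common way to learn a policy; neither the environment nor the algorithm choice comes from this page, and all names and hyperparameters are illustrative.

```python
import random

N_STATES = 5          # hypothetical corridor cells 0..4; cell 4 is the terminal goal
ACTIONS = (0, 1)      # 0 = step left, 1 = step right


def step(state, action):
    """Toy environment (an assumption): returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def discounted_return(rewards, gamma):
    """Cumulative reward with discounting: sum over t of gamma**t * r_t."""
    return sum(gamma ** t * r for t, r in enumerate(rewards))


def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy: explore with probability epsilon, else act greedily."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    best = max(q[state])
    # Break ties at random so the untrained agent does not always step left.
    return rng.choice([a for a in ACTIONS if q[state][a] == best])


def train(episodes=300, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning: the agent acts, observes a reward, updates Q."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(1000):  # step cap per episode
            action = choose_action(q, state, epsilon, rng)
            next_state, reward, done = step(state, action)
            # Bootstrapped target: reward plus discounted best next value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Best-known action for each non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


q_table = train()
policy = greedy_policy(q_table)
print(policy)
```

In this sketch the only reward sits at the far end of the corridor, so a policy that maximizes long-term reward must prefer stepping right even though each intermediate step pays nothing; the discount factor `gamma` is what makes shorter paths to the goal more valuable.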

Papers

Showing 14401-14450 of 15113 papers

| Title | Status | Hype |
|---|---|---|
| Dynamic Weights in Multi-Objective Deep Reinforcement Learning | Code | 0 |
| Asynchronous Methods for Model-Based Reinforcement Learning | Code | 0 |
| Illuminating Generalization in Deep Reinforcement Learning through Procedural Level Generation | Code | 0 |
| Deep Dyna-Q: Integrating Planning for Task-Completion Dialogue Policy Learning | Code | 0 |
| An Intrusion Response System utilizing Deep Q-Networks and System Partitions | Code | 0 |
| Deep Coordination Graphs | Code | 0 |
| Deep Bayesian Bandits Showdown: An Empirical Comparison of Bayesian Deep Networks for Thompson Sampling | Code | 0 |
| Multi-Horizon Representations with Hierarchical Forward Models for Reinforcement Learning | Code | 0 |
| DeepAveragers: Offline Reinforcement Learning by Solving Derived Non-Parametric MDPs | Code | 0 |
| Fully Convolutional Network with Multi-Step Reinforcement Learning for Image Processing | Code | 0 |
| Deep Attention Recurrent Q-Network | Code | 0 |
| Asynchronous Episodic Deep Deterministic Policy Gradient: Towards Continuous Control in Computationally Complex Environments | Code | 0 |
| ε-BMC: A Bayesian Ensemble Approach to Epsilon-Greedy Exploration in Model-Free Reinforcement Learning | Code | 0 |
| Deep Adaptive Multi-Intention Inverse Reinforcement Learning | Code | 0 |
| Challenges of Context and Time in Reinforcement Learning: Introducing Space Fortress as a Benchmark | Code | 0 |
| Active One-shot Learning | Code | 0 |
| Learning and reusing primitive behaviours to improve Hindsight Experience Replay sample efficiency | Code | 0 |
| Fully Parameterized Quantile Function for Distributional Reinforcement Learning | Code | 0 |
| Functional Acceleration for Policy Mirror Descent | Code | 0 |
| Active Object Localization with Deep Reinforcement Learning | Code | 0 |
| Challenges in High-dimensional Reinforcement Learning with Evolution Strategies | Code | 0 |
| A Survey on Reproducibility by Evaluating Deep Reinforcement Learning Algorithms on Real-World Robots | Code | 0 |
| Actively Learning Costly Reward Functions for Reinforcement Learning | Code | 0 |
| Challenges and Countermeasures for Adversarial Attacks on Deep Reinforcement Learning | Code | 0 |
| CGAR: Critic Guided Action Redistribution in Reinforcement Leaning | Code | 0 |
| Deep Active Inference as Variational Policy Gradients | Code | 0 |
| Decoupling regularization from the action space | Code | 0 |
| Long-Term Exploration in Persistent MDPs | Code | 0 |
| Learning to Navigate Using Mid-Level Visual Priors | Code | 0 |
| Learning Approximate Stochastic Transition Models | Code | 0 |
| Decoupling feature extraction from policy learning: assessing benefits of state representation learning in goal based robotics | Code | 0 |
| Decoupling Dynamics and Reward for Transfer Learning | Code | 0 |
| CFlowNets: Continuous Control with Generative Flow Networks | Code | 0 |
| Imagination-Augmented Agents for Deep Reinforcement Learning | Code | 0 |
| Learning Reward Machines for Partially Observable Reinforcement Learning | Code | 0 |
| A Survey on Offline Reinforcement Learning: Taxonomy, Review, and Open Problems | Code | 0 |
| Learning Reward Models for Cooperative Trajectory Planning with Inverse Reinforcement Learning and Monte Carlo Tree Search | Code | 0 |
| Deconfounding Reinforcement Learning in Observational Settings | Code | 0 |
| AgGym: An agricultural biotic stress simulation environment for ultra-precision management planning | Code | 0 |
| Fuzzy Logic Guided Reward Function Variation: An Oracle for Testing Reinforcement Learning Programs | Code | 0 |
| Deconfounding Actor-Critic Network with Policy Adaptation for Dynamic Treatment Regimes | Code | 0 |
| Certified Policy Smoothing for Cooperative Multi-Agent Reinforcement Learning | Code | 0 |
| Imagining In-distribution States: How Predictable Robot Behavior Can Enable User Control Over Learned Policies | Code | 0 |
| GAC: A Deep Reinforcement Learning Model Toward User Incentivization in Unknown Social Networks | Code | 0 |
| Certification of Iterative Predictions in Bayesian Neural Networks | Code | 0 |
| Centralized Training with Hybrid Execution in Multi-Agent Reinforcement Learning | Code | 0 |
| Toward Evaluating Robustness of Reinforcement Learning with Adversarial Policy | Code | 0 |
| Effects of Spectral Normalization in Multi-agent Reinforcement Learning | Code | 0 |
| Imitate the Good and Avoid the Bad: An Incremental Approach to Safe Reinforcement Learning | Code | 0 |
| Adaptive Symmetric Reward Noising for Reinforcement Learning | Code | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |