SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes long-term reward.
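The interaction loop described above can be illustrated with a minimal sketch of tabular Q-learning, one common RL algorithm chosen here only as an example. Everything in it is an assumption made for illustration: the five-state chain environment, the `step`, `greedy`, and `train` helpers, and the hyperparameter values are hypothetical and do not come from any paper listed on this page.

```python
import random

N_STATES = 5          # hypothetical chain: states 0..4, state 4 is the rewarding terminal state
ACTIONS = (0, 1)      # 0 = move left, 1 = move right


def step(state, action):
    """Toy environment (an assumption): return (next_state, reward, done)."""
    next_state = max(state - 1, 0) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def greedy(q_row, rng):
    """Pick a highest-valued action, breaking ties at random."""
    best = max(q_row)
    return rng.choice([a for a in ACTIONS if q_row[a] == best])


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap the episode length
            # Explore with probability epsilon, otherwise act greedily.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = greedy(q[state], rng)
            next_state, reward, done = step(state, action)
            # Bootstrapped target: reward plus discounted best next value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


q_table = train()
# Greedy policy for each non-terminal state.
policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
print(policy)
```

The agent receives a reward only on reaching the last state, so the learned greedy policy should prefer moving right in every non-terminal state. The discount factor `gamma` is what makes the agent value the delayed reward, which is the "long-term" part of the objective.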

Papers

Showing 13801-13850 of 15113 papers

Title | Status | Hype
Prosocial learning agents solve generalized Stag Hunts better than selfish ones | Code | 0
Meta Reinforcement Learning with Finite Training Tasks -- a Density Estimation Approach | Code | 0
Deployable Reinforcement Learning with Variable Control Rate | Code | 0
Action-Decision Networks for Visual Tracking With Deep Reinforcement Learning | Code | 0
A2PO: Towards Effective Offline Reinforcement Learning from an Advantage-aware Perspective | Code | 0
Dependability Analysis of Deep Reinforcement Learning based Robotics and Autonomous Systems through Probabilistic Model Checking | Code | 0
DenseLight: Efficient Control for Large-scale Traffic Signals with Dense Feedback | Code | 0
Learning to Communicate Functional States with Nonverbal Expressions for Improved Human-Robot Collaboration | Code | 0
Exploring Natural Language-Based Strategies for Efficient Number Learning in Children through Reinforcement Learning | Code | 0
Exploring Parity Challenges in Reinforcement Learning through Curriculum Learning with Noisy Labels | Code | 0
Exploring RL-based LLM Training for Formal Language Tasks with Programmed Rewards | Code | 0
A view on learning robust goal-conditioned value functions: Interplay between RL and MPC | Code | 0
Constrained Policy Optimization | Code | 0
Input Convex Neural Networks | Code | 0
Constrained Exploration and Recovery from Experience Shaping | Code | 0
Exploring the Impact of Tunable Agents in Sequential Social Dilemmas | Code | 0
Conservative Q-Improvement: Reinforcement Learning for an Interpretable Decision-Tree Policy | Code | 0
Active inference: demystified and compared | Code | 0
AutoRL Hyperparameter Landscapes | Code | 0
Autoregressive Policies for Continuous Control Deep Reinforcement Learning | Code | 0
Exploring the Training Robustness of Distributional Reinforcement Learning against Noisy State Observations | Code | 0
IN-RIL: Interleaved Reinforcement and Imitation Learning for Policy Fine-Tuning | Code | 0
Exploring the robustness of TractOracle methods in RL-based tractography | Code | 0
Logical Specifications-guided Dynamic Task Sampling for Reinforcement Learning Agents | Code | 0
Insights From the NeurIPS 2021 NetHack Challenge | Code | 0
Lessons learned from field demonstrations of model predictive control and reinforcement learning for residential and commercial HVAC: A review | Code | 0
Exploring Unknown States with Action Balance | Code | 0
Designing Neural Network Architectures using Reinforcement Learning | Code | 0
Exploring with Sticky Mittens: Reinforcement Learning with Expert Interventions via Option Templates | Code | 0
Hierarchically Structured Task-Agnostic Continual Learning | Code | 0
Designing Reinforcement Learning Algorithms for Digital Interventions: Pre-implementation Guidelines | Code | 0
Exponential Family Model-Based Reinforcement Learning via Score Matching | Code | 0
Hierarchical Meta Reinforcement Learning for Multi-Task Environments | Code | 0
Approximate Model-Based Shielding for Safe Reinforcement Learning | Code | 0
Multi-task Learning and Catastrophic Forgetting in Continual Reinforcement Learning | Code | 0
Hierarchical Object Detection with Deep Reinforcement Learning | Code | 0
Instance based Generalization in Reinforcement Learning | Code | 0
Delta Schema Network in Model-based Reinforcement Learning | Code | 0
Learning to Communicate with Deep Multi-Agent Reinforcement Learning | Code | 0
Conservative Optimistic Policy Optimization via Multiple Importance Sampling | Code | 0
Detecting Adversarial Attacks on Neural Network Policies with Visual Foresight | Code | 0
Accuracy-based Curriculum Learning in Deep Reinforcement Learning | Code | 0
A Multi-Agent Off-Policy Actor-Critic Algorithm for Distributed Reinforcement Learning | Code | 0
Expressive Priors in Bayesian Neural Networks: Kernel Combinations and Periodic Functions | Code | 0
Action-Conditional Video Prediction using Deep Networks in Atari Games | Code | 0
Detecting Spiky Corruption in Markov Decision Processes | Code | 0
Instance Selection for Dynamic Algorithm Configuration with Reinforcement Learning: Improving Generalization | Code | 0
Extended Markov Games to Learn Multiple Tasks in Multi-Agent Reinforcement Learning | Code | 0
Deterministic Implementations for Reproducibility in Deep Reinforcement Learning | Code | 0
Deterministic Policy Gradient Algorithms | Code | 0
Page 277 of 303

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | - | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | - | Unverified