SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes the expected long-term reward.
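The interaction loop described above can be sketched with tabular Q-learning on a toy problem. This is a minimal illustration, not a method from any paper listed here: the corridor environment, the `step`, `train`, and `greedy_policy` helpers, and all hyperparameter values are assumptions chosen for the example.

```python
import random

# Hypothetical toy environment: a corridor of N_STATES cells.
# The agent starts in cell 0 and is rewarded only on reaching the last cell.
N_STATES = 5
ACTIONS = (0, 1)  # 0 = move left, 1 = move right
GOAL = N_STATES - 1


def step(state, action):
    """Return (next_state, reward, done) for the toy corridor."""
    next_state = max(0, state - 1) if action == 0 else min(GOAL, state + 1)
    done = next_state == GOAL
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Learn action values with epsilon-greedy tabular Q-learning."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon, or when both values are tied.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.choice(ACTIONS)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, action)
            # Bootstrapped target: reward plus discounted best next value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Pick the higher-valued action in every non-terminal state."""
    return [0 if left > right else 1 for left, right in q[:GOAL]]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

Here the reward signal is sparse (only the final step pays out), so the learned values for moving right decay by the discount factor `gamma` with distance from the goal, and the greedy policy derived from them moves right in every cell.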

Papers

Showing 13151–13200 of 15113 papers

Title | Status | Hype
Playing Atari with Six Neurons | Code | 0
Offline Learning of Closed-Loop Deep Brain Stimulation Controllers for Parkinson Disease Treatment | Code | 0
Training Transition Policies via Distribution Matching for Complex Tasks | Code | 0
Zero-Shot Reinforcement Learning via Function Encoders | Code | 0
MULTIPOLAR: Multi-Source Policy Aggregation for Transfer Reinforcement Learning between Diverse Environmental Dynamics | Code | 0
TrajDeleter: Enabling Trajectory Forgetting in Offline Reinforcement Learning Agents | Code | 0
Playing Atari Games with Deep Reinforcement Learning and Human Checkpoint Replay | Code | 0
Reinforcement Learning based Interconnection Routing for Adaptive Traffic Optimization | Code | 0
The Option-Critic Architecture | Code | 0
QFlip: An Adaptive Reinforcement Learning Strategy for the FlipIt Security Game | Code | 0
A Generic Graph Sparsification Framework using Deep Reinforcement Learning | Code | 0
Trajectory-Aware Eligibility Traces for Off-Policy Reinforcement Learning | Code | 0
Multiple Object Recognition with Visual Attention | Code | 0
M^3RL: Mind-aware Multi-agent Management Reinforcement Learning | Code | 0
Sparse Black-box Video Attack with Reinforcement Learning | Code | 0
Trajectory-Based Off-Policy Deep Reinforcement Learning | Code | 0
M^2DQN: A Robust Method for Accelerating Deep Q-learning Network | Code | 0
Robust Constrained-MDPs: Soft-Constrained Robust Policy Optimization under Model Uncertainty | Code | 0
Playing 2048 With Reinforcement Learning | Code | 0
Sparsely ensembled convolutional neural network classifiers via reinforcement learning | Code | 0
Vulnerability of Deep Reinforcement Learning to Policy Induction Attacks | Code | 0
Meta-Learning of Structured Task Distributions in Humans and Machines | Code | 0
Multiple Landmark Detection using Multi-Agent Reinforcement Learning | Code | 0
Sparse-Reg: Improving Sample Complexity in Offline Reinforcement Learning using Sparsity | Code | 0
Reinforcement Learning-based Heuristics to Guide Domain-Independent Dynamic Programming | Code | 0
Theory of Mind for Deep Reinforcement Learning in Hanabi | Code | 0
Reinforcement Learning Based Graph-to-Sequence Model for Natural Question Generation | Code | 0
Contrastive Multi-document Question Generation | Code | 0
Reinforcement Learning based Collective Entity Alignment with Adaptive Features | Code | 0
Reinforcement learning based adaptive metaheuristics | Code | 0
Regret-Based Defense in Adversarial Reinforcement Learning | Code | 0
DRIBO: Robust Deep Reinforcement Learning via Multi-View Information Bottleneck | Code | 0
When Does Neuroevolution Outcompete Reinforcement Learning in Transfer Learning Tasks? | Code | 0
Offline Behavior Distillation | Code | 0
Reinforcement Learning-based Adaptation and Scheduling Methods for Multi-source DASH | Code | 0
Robust Distant Supervision Relation Extraction via Deep Reinforcement Learning | Code | 0
Multi-Pass Q-Networks for Deep Reinforcement Learning with Parameterised Action Spaces | Code | 0
Robust Visual Domain Randomization for Reinforcement Learning | Code | 0
WALL-E: An Efficient Reinforcement Learning Research Framework | Code | 0
The PlayStation Reinforcement Learning Environment (PSXLE) | Code | 0
Variational Quantum Circuits for Deep Reinforcement Learning | Code | 0
The Potential of the Return Distribution for Exploration in RL | Code | 0
Robust exploration in linear quadratic reinforcement learning | Code | 0
Maximum Reward Formulation In Reinforcement Learning | Code | 0
Planning with Goal-Conditioned Policies | Code | 0
When Do Skills Help Reinforcement Learning? A Theoretical Analysis of Temporal Abstractions | Code | 0
Off Environment Evaluation Using Convex Risk Minimization | Code | 0
Spatiotemporally Constrained Action Space Attacks on Deep Reinforcement Learning Agents | Code | 0
Multi-objective Pointer Network for Combinatorial Optimization | Code | 0
Specializing Versatile Skill Libraries using Local Mixture of Experts | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | - | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | - | Unverified