SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes long-term reward.
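The interaction loop described above can be sketched with tabular Q-learning, one common RL algorithm chosen here purely for illustration. The five-state chain environment, the `step` function, and all hyperparameters are hypothetical and do not come from any paper listed on this page.

```python
import random

N_STATES = 5          # states 0..4; state 4 is the rewarding terminal state
ACTIONS = (0, 1)      # 0 = move left, 1 = move right


def step(state, action):
    """Hypothetical toy environment: returns (next_state, reward, done)."""
    next_state = max(state - 1, 0) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy action choice with random tie-breaking."""
    if rng.random() < epsilon or q[state][0] == q[state][1]:
        return rng.choice(ACTIONS)
    return 0 if q[state][0] > q[state][1] else 1


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Run Q-learning and return the learned action-value table."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(50):  # cap the episode length
            action = choose_action(q, state, epsilon, rng)
            next_state, reward, done = step(state, action)
            # Move the estimate toward reward + discounted best next value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Best-known action for each non-terminal state."""
    return [0 if q[s][0] > q[s][1] else 1 for s in range(N_STATES - 1)]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

In this toy setting the only reward sits at the right end of the chain, so the learned greedy policy is expected to prefer action 1 (move right) in every non-terminal state. The discount factor `gamma` is what makes the agent value reaching the reward sooner rather than later.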

Papers

Showing 13151–13200 of 15113 papers

| Title | Status | Hype |
| --- | --- | --- |
| MEETING BOT: Reinforcement Learning for Dialogue Based Meeting Scheduling |  | 0 |
| Quantum Adiabatic Algorithm Design using Reinforcement Learning |  | 0 |
| Generative Adversarial User Model for Reinforcement Learning Based Recommendation System | Code | 0 |
| Dealing with Limited Backhaul Capacity in Millimeter Wave Systems: A Deep Reinforcement Learning Approach |  | 0 |
| A New Concept of Deep Reinforcement Learning based Augmented General Sequence Tagging System |  | 0 |
| Learning to Walk via Deep Reinforcement Learning |  | 0 |
| Deconfounding Reinforcement Learning in Observational Settings | Code | 0 |
| Optimizing Market Making using Multi-Agent Reinforcement Learning |  | 0 |
| VMAV-C: A Deep Attention-based Reinforcement Learning Algorithm for Model-based Control |  | 0 |
| Iroko: A Framework to Prototype Reinforcement Learning for Data Center Traffic Control | Code | 0 |
| Escape Room: A Configurable Testbed for Hierarchical Reinforcement Learning |  | 0 |
| Learning to Navigate the Web |  | 0 |
| NADPEx: An on-policy temporally consistent exploration method for deep reinforcement learning |  | 0 |
| Pre-training with Non-expert Human Demonstration for Deep Reinforcement Learning | Code | 0 |
| Optimizing Quantum Error Correction Codes with Reinforcement Learning |  | 0 |
| A Review of Meta-Reinforcement Learning for Deep Neural Networks Architecture Search |  | 0 |
| TD-Regularized Actor-Critic Methods | Code | 0 |
| Universal Successor Features Approximators | Code | 0 |
| Information-Directed Exploration for Deep Reinforcement Learning | Code | 0 |
| Incentive-based demand response for smart grid with reinforcement learning and deep neural network |  | 0 |
| Deep reinforcement learning for search, recommendation, and online advertising: a survey |  | 0 |
| Domain Adaptation for Reinforcement Learning on the Atari |  | 0 |
| An Atari Model Zoo for Analyzing, Visualizing, and Comparing Deep Reinforcement Learning Agents | Code | 0 |
| A Review of Meta-Reinforcement Learning for Deep Neural Networks Architecture Search |  | 0 |
| Fuzzy Controller of Reward of Reinforcement Learning For Handwritten Digit Recognition |  | 0 |
| Reinforcement Learning for Adaptive Caching with Dynamic Storage Pricing |  | 0 |
| Malthusian Reinforcement Learning |  | 0 |
| Decentralized Computation Offloading for Multi-User Mobile Edge Computing: A Deep Reinforcement Learning Approach | Code | 0 |
| Likelihood Quantile Networks for Coordinating Multi-Agent Reinforcement Learning |  | 0 |
| Residual Policy Learning | Code | 0 |
| Scaling shared model governance via model splitting |  | 0 |
| Guaranteed satisficing and finite regret: Analysis of a cognitive satisficing value function |  | 0 |
| IRLAS: Inverse Reinforcement Learning for Architecture Search | Code | 0 |
| Learning to Communicate: A Machine Learning Framework for Heterogeneous Multi-Agent Robotic Systems |  | 0 |
| Exploration Conscious Reinforcement Learning Revisited | Code | 0 |
| A predictive safety filter for learning-based control of constrained nonlinear dynamical systems |  | 0 |
| Efficient Model-Free Reinforcement Learning Using Gaussian Process |  | 0 |
| KF-LAX: Kronecker-factored curvature estimation for control variate optimization in reinforcement learning |  | 0 |
| Dialogue Generation: From Imitation Learning to Inverse Reinforcement Learning | Code | 0 |
| The Gap Between Model-Based and Model-Free Methods on the Linear Quadratic Regulator: An Asymptotic Viewpoint |  | 0 |
| Learning Montezuma's Revenge from a Single Demonstration |  | 0 |
| Communication-Efficient Policy Gradient Methods for Distributed Reinforcement Learning |  | 0 |
| Residual Reinforcement Learning for Robot Control |  | 0 |
| Measuring and Characterizing Generalization in Deep Reinforcement Learning |  | 0 |
| ToyBox: Better Atari Environments for Testing Reinforcement Learning Agents | Code | 0 |
| Pseudo-Rehearsal: Achieving Deep Reinforcement Learning without Catastrophic Forgetting | Code | 0 |
| Active Deep Q-learning with Demonstration |  | 0 |
| Finite-Sample Analysis For Decentralized Batch Multi-Agent Reinforcement Learning With Networked Agents |  | 0 |
| Deep Reinforcement Learning and the Deadly Triad |  | 0 |
| Adapting Auxiliary Losses Using Gradient Similarity |  | 0 |
Page 264 of 303

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | PPG | Mean Normalized Performance | 0.76 |  | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 |  | Unverified |