SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes the long-term reward.
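
As a rough illustration of the loop described above, here is a minimal sketch of tabular Q-learning, one common RL method. It is not tied to any paper listed on this page. The chain environment, the function names (`step`, `train`, `greedy_policy`), and all hyperparameter values are hypothetical and chosen only for this example.

```python
import random


def step(state, action, n_states=5):
    """Hypothetical chain environment (an assumption for this sketch).

    Action 1 moves right, action 0 moves left. Reaching the last state
    gives reward 1.0 and ends the episode; every other step gives 0.0.
    """
    if action == 1:
        next_state = min(state + 1, n_states - 1)
    else:
        next_state = max(state - 1, 0)
    done = next_state == n_states - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def train(episodes=500, n_states=5, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Learn action values Q[state][action] by interacting with the chain."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy: explore sometimes, and break ties at random.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, action, n_states)
            # Q-learning target: immediate reward plus discounted best
            # estimated value of the next state (zero if the episode ended).
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Pick the higher-valued action in each state (ties go to action 1)."""
    return [0 if left > right else 1 for left, right in q]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

In this toy setting the learned greedy policy should choose "move right" in every non-terminal state, since that is the only way to reach the reward. The discount factor `gamma` is what makes the agent weigh long-term reward rather than only the next step.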

Papers

Showing 13051-13100 of 15113 papers

Title | Status | Hype
Decentralized Computation Offloading for Multi-User Mobile Edge Computing: A Deep Reinforcement Learning Approach | Code | 0
Likelihood Quantile Networks for Coordinating Multi-Agent Reinforcement Learning | - | 0
Residual Policy Learning | Code | 0
Simulation to Scaled City: Zero-Shot Policy Transfer for Traffic Control via Autonomous Vehicles | Code | 2
Scaling shared model governance via model splitting | - | 0
Dopamine: A Research Framework for Deep Reinforcement Learning | Code | 3
Guaranteed satisficing and finite regret: Analysis of a cognitive satisficing value function | - | 0
Learning to Communicate: A Machine Learning Framework for Heterogeneous Multi-Agent Robotic Systems | - | 0
IRLAS: Inverse Reinforcement Learning for Architecture Search | Code | 0
Exploration Conscious Reinforcement Learning Revisited | Code | 0
Soft Actor-Critic Algorithms and Applications | Code | 1
A predictive safety filter for learning-based control of constrained nonlinear dynamical systems | - | 0
KF-LAX: Kronecker-factored curvature estimation for control variate optimization in reinforcement learning | - | 0
Efficient Model-Free Reinforcement Learning Using Gaussian Process | - | 0
Dialogue Generation: From Imitation Learning to Inverse Reinforcement Learning | Code | 0
The Gap Between Model-Based and Model-Free Methods on the Linear Quadratic Regulator: An Asymptotic Viewpoint | - | 0
Learning Montezuma's Revenge from a Single Demonstration | - | 0
Communication-Efficient Policy Gradient Methods for Distributed Reinforcement Learning | - | 0
Residual Reinforcement Learning for Robot Control | - | 0
Off-Policy Deep Reinforcement Learning without Exploration | Code | 1
Measuring and Characterizing Generalization in Deep Reinforcement Learning | - | 0
Pseudo-Rehearsal: Achieving Deep Reinforcement Learning without Catastrophic Forgetting | Code | 0
Quantifying Generalization in Reinforcement Learning | Code | 1
ToyBox: Better Atari Environments for Testing Reinforcement Learning Agents | Code | 0
Deep Reinforcement Learning and the Deadly Triad | - | 0
Active Deep Q-learning with Demonstration | - | 0
Finite-Sample Analysis For Decentralized Batch Multi-Agent Reinforcement Learning With Networked Agents | - | 0
Adapting Auxiliary Losses Using Gradient Similarity | - | 0
Composing Entropic Policies using Divergence Correction | - | 0
The effects of negative adaptation in Model-Agnostic Meta-Learning | - | 0
Relative Entropy Regularized Policy Iteration | Code | 0
Playing Text-Adventure Games with Graph-Based Deep Reinforcement Learning | Code | 0
Exploration versus exploitation in reinforcement learning: a stochastic control approach | - | 0
Hyperbolic Embeddings for Learning Options in Hierarchical Reinforcement Learning | - | 0
Learning Vine Copula Models For Synthetic Data Generation | - | 0
Deep Reinforcement Learning for Intelligent Transportation Systems | - | 0
FoldingZero: Protein Folding from Scratch in Hydrophobic-Polar Model | - | 0
Generative Adversarial Self-Imitation Learning | - | 0
Bach2Bach: Generating Music Using A Deep Reinforcement Learning Approach | - | 0
Towards Solving Text-based Games by Producing Adaptive Action Spaces | Code | 0
Visual Foresight: Model-Based Deep Reinforcement Learning for Vision-Based Robotic Control | Code | 0
Resource Constrained Deep Reinforcement Learning | - | 0
Multi-agent Deep Reinforcement Learning with Extremely Noisy Observations | - | 0
Mitigating Planner Overfitting in Model-Based Reinforcement Learning | - | 0
Revisiting the Softmax Bellman Operator: New Benefits and New Perspective | Code | 0
Macro action selection with deep reinforcement learning in StarCraft | Code | 0
Reinforcement Learning with Multiple Experts: A Bayesian Model Combination Approach | - | 0
Simple random search of static linear policies is competitive for reinforcement learning | Code | 0
REFUEL: Exploring Sparse Features in Deep Reinforcement Learning for Fast Disease Diagnosis | - | 0
Multiple-Step Greedy Policies in Approximate and Online Reinforcement Learning | - | 0
Page 262 of 303

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | - | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | - | Unverified