SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment to maximize a cumulative reward signal. The agent interacts with the environment and learns by receiving feedback in the form of rewards or punishments for its actions. The goal of reinforcement learning is to find the optimal policy or decision-making strategy that maximizes the long-term reward.
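As a rough illustration of the loop described above, the following is a minimal sketch of tabular Q-learning, one common RL algorithm. It assumes a hypothetical five-state chain environment: the `step` function, the reward of 1 at the goal, and every hyperparameter value are invented for this example and do not come from any paper listed on this page.

```python
import random

N_STATES = 5          # hypothetical chain: states 0..4, state 4 is the goal
LEFT, RIGHT = 0, 1    # the two available actions


def step(state, action):
    """Toy environment (an assumption): move along the chain, reward 1 at the goal."""
    if action == RIGHT:
        next_state = min(state + 1, N_STATES - 1)
    else:
        next_state = max(state - 1, 0)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn action values Q(s, a) from interaction using epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                # Explore: pick a random action.
                action = rng.choice((LEFT, RIGHT))
            else:
                # Exploit: pick a best-valued action, breaking ties at random.
                best_value = max(q[state])
                best_actions = [a for a in (LEFT, RIGHT) if q[state][a] == best_value]
                action = rng.choice(best_actions)
            next_state, reward, done = step(state, action)
            # The target is the reward plus the discounted value of the next state.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Return the action with the highest learned value in each state."""
    return [RIGHT if row[RIGHT] > row[LEFT] else LEFT for row in q]


q_table = train()
print(greedy_policy(q_table)[:-1])  # policy for the non-terminal states 0..3
```

In this toy setting the learned greedy policy should move right in every non-terminal state, since that is the shortest route to the only reward. The discount factor `gamma` is what makes the agent prefer reaching the goal sooner, which is one way of expressing "long-term reward" in code.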

Papers

Showing 11551–11600 of 15113 papers

Title | Status | Hype
IKEA Furniture Assembly Environment for Long-Horizon Complex Manipulation Tasks | Code | 0
Hebbian Synaptic Modifications in Spiking Neurons that Learn | | 0
Inverse Reinforcement Learning with Missing Data | | 0
Generalized Maximum Causal Entropy for Inverse Reinforcement Learning | | 0
Off-Policy Policy Gradient Algorithms by Constraining the State Distribution Shift | | 0
Missingness as Stability: Understanding the Structure of Missingness in Longitudinal EHR data and its Impact on Reinforcement Learning in Healthcare | Code | 0
Reinforcement Learning from Imperfect Demonstrations under Soft Expert Guidance | | 0
Empirical Study of Off-Policy Policy Evaluation for Reinforcement Learning | Code | 0
Improved Exploration through Latent Trajectory Optimization in Deep Deterministic Policy Gradient | | 0
Data-efficient Co-Adaptation of Morphology and Behaviour with Deep Reinforcement Learning | | 0
Deep Reinforcement Learning for Adaptive Traffic Signal Control | | 0
A Reduction from Reinforcement Learning to No-Regret Online Learning | | 0
Gamifying the Vehicle Routing Problem with Stochastic Requests | | 0
Reinforcement Learning for Market Making in a Multi-agent Dealer Market | Code | 0
Asymptotics of Reinforcement Learning with Neural Networks | | 0
Kinematic State Abstraction and Provably Efficient Rich-Observation Reinforcement Learning | | 0
A Convergent Off-Policy Temporal Difference Algorithm | Code | 0
Learning to Communicate in Multi-Agent Reinforcement Learning : A Review | | 0
Buffer-aware Wireless Scheduling based on Deep Reinforcement Learning | | 0
Reinforcement Learning-Driven Test Generation for Android GUI Applications using Formal Specifications | | 0
One-shot learning and behavioral eligibility traces in sequential decision making | | 0
Schedule Earth Observation satellites with Deep Reinforcement Learning | | 0
MSDF: A Deep Reinforcement Learning Framework for Service Function Chain Migration | | 0
Efficient Planning under Partial Observability with Unnormalized Q Functions and Spectral Learning | | 0
Accelerating Training in Pommerman with Imitation and Reinforcement Learning | | 0
Combinatorial Optimization by Graph Pointer Networks and Hierarchical Reinforcement Learning | Code | 1
Reinforcement-Learning-Based Variational Quantum Circuits Optimization for Combinatorial Problems | | 0
SMIX(λ): Enhancing Centralized Value Functions for Cooperative Multi-Agent Reinforcement Learning | Code | 0
Multi-Agent Connected Autonomous Driving using Deep Reinforcement Learning | Code | 0
Real-Time Reinforcement Learning | Code | 0
Driving Reinforcement Learning with Models | Code | 0
Context-aware Active Multi-Step Reinforcement Learning | | 0
DRiLLS: Deep Reinforcement Learning for Logic Synthesis | Code | 0
Learning to Order Sub-questions for Complex Question Answering | | 0
Value-Added Chemical Discovery Using Reinforcement Learning | | 0
Minimalistic Attacks: How Little it Takes to Fool a Deep Reinforcement Learning Policy | | 0
Deep Reinforcement Learning Based Dynamic Trajectory Control for UAV-assisted Mobile Edge Computing | | 0
Model-Based Reinforcement Learning with Adversarial Training for Online Recommendation | Code | 0
Hierarchical Reinforcement Learning Method for Autonomous Vehicle Behavior Planning | | 0
Learning to reinforcement learn for Neural Architecture Search | Code | 0
Worst Cases Policy Gradients | | 0
Contrastive Multi-document Question Generation | Code | 0
Fully Bayesian Recurrent Neural Networks for Safe Reinforcement Learning | | 0
H∞ Model-free Reinforcement Learning with Robust Stability Guarantee | Code | 0
Option Compatible Reward Inverse Reinforcement Learning | | 0
Optimizing the Factual Correctness of a Summary: A Study of Summarizing Radiology Reports | | 0
MBCAL: Sample Efficient and Variance Reduced Reinforcement Learning for Recommender Systems | | 0
Experience Sharing Between Cooperative Reinforcement Learning Agents | | 0
Distributional Reward Decomposition for Reinforcement Learning | | 0
Improving reinforcement learning algorithms: towards optimal learning rate policies | Code | 0
Page 232 of 303

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | | Unverified