SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback on its actions, given as rewards or penalties. The goal is to find the optimal policy, that is, the decision-making strategy that maximizes long-term reward.
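As a rough illustration of the agent-environment loop described above, the following minimal sketch runs tabular Q-learning on a toy five-state chain. Everything in it is an assumption made for illustration and is not taken from any paper listed here: the environment (`step`), the function names (`train_q_learning`, `greedy_policy`), the choice of Q-learning with epsilon-greedy exploration, and the hyperparameters (`alpha`, `gamma`, `epsilon`).

```python
import random

N_STATES = 5      # hypothetical chain: states 0..4, state 4 is the rewarding terminal state
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Toy environment (an assumption): returns (next_state, reward, done)."""
    if action == 1:
        next_state = min(state + 1, N_STATES - 1)
    else:
        next_state = max(state - 1, 0)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0  # reward only on reaching the last state
    return next_state, reward, done


def train_q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Learn a table of action values Q[state][action] from interaction."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap the episode length
            if rng.random() < epsilon:
                # Explore: pick a random action.
                action = rng.choice(ACTIONS)
            else:
                # Exploit: pick the best-valued action, breaking ties randomly.
                action = max(ACTIONS, key=lambda a: (q[state][a], rng.random()))
            next_state, reward, done = step(state, action)
            # Target: immediate reward plus discounted value of the next state.
            future = 0.0 if done else gamma * max(q[next_state])
            q[state][action] += alpha * (reward + future - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Best action in each non-terminal state according to the learned values."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


if __name__ == "__main__":
    q_table = train_q_learning()
    print(greedy_policy(q_table))
```

In this toy setting the only reward sits at the right end of the chain, so the learned greedy policy is expected to move right in every state. The discount factor `gamma` is one common way to define long-term reward; other formulations exist.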

Papers

Showing 12501-12550 of 15113 papers

Title | Status | Hype
Injecting Prior Knowledge for Transfer Learning into Reinforcement Learning Algorithms using Logic Tensor Networks | | 0
Epistemic Risk-Sensitive Reinforcement Learning | | 0
Provably Efficient Q-learning with Function Approximation via Distribution Shift Error Checking Oracle | | 0
Self-Tuning Sectorization: Deep Reinforcement Learning Meets Broadcast Beam Optimization | | 0
Sub-policy Adaptation for Hierarchical Reinforcement Learning | | 0
Modeling and Interpreting Real-world Human Risk Decision Making with Inverse Reinforcement Learning | | 0
Goal-conditioned Imitation Learning | Code | 0
Deep Reinforcement Learning for Industrial Insertion Tasks with Visual Inputs and Natural Rewards | Code | 0
Cross-View Policy Learning for Street Navigation | Code | 0
Deep Reinforcement Learning for Cyber Security | | 0
Conditioning of Reinforcement Learning Agents and its Policy Regularization Application | | 0
Deep Reinforcement Learning for Unmanned Aerial Vehicle-Assisted Vehicular Networks | | 0
Sub-Goal Trees -- a Framework for Goal-Directed Trajectory Prediction and Optimization | | 0
Regret Minimization for Reinforcement Learning by Evaluating the Optimal Bias Function | | 0
Adaptive Optimal Control for Reference Tracking Independent of Exo-System Dynamics | | 0
Search on the Replay Buffer: Bridging Planning and Reinforcement Learning | | 0
Reinforcement Knowledge Graph Reasoning for Explainable Recommendation | Code | 0
Reinforcement Learning for Channel Coding: Learned Bit-Flipping Decoding | Code | 0
Reinforcement Learning of Minimalist Numeral Grammars | | 0
Reinforcement Learning for Integer Programming: Learning to Cut | | 0
Learning to Score Behaviors for Guided Policy Optimization | Code | 0
Towards Inverse Reinforcement Learning for Limit Order Book Dynamics | | 0
A Hybrid Approach Between Adversarial Generative Networks and Actor-Critic Policy Gradient for Low Rate High-Resolution Image Compression | | 0
Causal Discovery with Reinforcement Learning | | 0
Continual Reinforcement Learning deployed in Real-life using Policy Distillation and Sim2Real Transfer | | 0
Dealing with Non-Stationarity in Multi-Agent Deep Reinforcement Learning | | 0
Deep Reinforcement Learning with Discrete Normalized Advantage Functions for Resource Management in Network Slicing | | 0
Exploration via Hindsight Goal Generation | Code | 0
A Survey of Reinforcement Learning Informed by Natural Language | | 0
Model-Based Reinforcement Learning with a Generative Model is Minimax Optimal | | 0
Neural Keyphrase Generation via Reinforcement Learning with Adaptive Rewards | Code | 0
SVRG for Policy Evaluation with Fewer Gradient Evaluations | Code | 0
Transfer Learning by Modeling a Distribution over Policies | | 0
Neural Heterogeneous Scheduler | | 0
Intrinsically Efficient, Stable, and Bounded Off-Policy Evaluation for Reinforcement Learning | Code | 0
Gossip-based Actor-Learner Architectures for Deep Reinforcement Learning | Code | 0
Curiosity-Driven Multi-Criteria Hindsight Experience Replay | Code | 0
Towards Optimal Off-Policy Evaluation for Reinforcement Learning with Marginalized Importance Sampling | | 0
Preference-based Interactive Multi-Document Summarisation | Code | 0
Multi-modal Active Learning From Human Data: A Deep Reinforcement Learning Approach | | 0
Worst-Case Regret Bounds for Exploration via Randomized Value Functions | | 0
Non-Stationary Reinforcement Learning: The Blessing of (More) Optimism | | 0
Ego-Pose Estimation and Forecasting as Real-Time PD Control | Code | 0
Deep Reinforcement Learning for Multi-objective Optimization | | 0
DeepMDP: Learning Continuous Latent Space Models for Representation Learning | | 0
Combining Reinforcement Learning and Configuration Checking for Maximum k-plex Problem | | 0
An Extensible Interactive Interface for Agent Design | | 0
Improving Exploration in Soft-Actor-Critic with Normalizing Flows Policies | Code | 0
Clustered Reinforcement Learning | | 0
Towards Interpretable Reinforcement Learning Using Attention Augmented Agents | Code | 0
Page 251 of 303

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | | Unverified