SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes long-term reward.
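
The agent-environment loop described above can be illustrated with a minimal sketch. It uses tabular Q-learning, one common RL algorithm, chosen here only as an example; the five-state corridor environment, the `step`, `train`, and `greedy_policy` helpers, and all hyperparameter values are hypothetical and not taken from any paper listed on this page.

```python
import random

N_STATES = 5      # hypothetical corridor: states 0..4, state 4 is the goal
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Toy environment transition: returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _episode in range(episodes):
        state = 0
        for _t in range(200):  # step cap so every episode ends
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)  # explore
            else:
                best = max(q[state])          # exploit, breaking ties at random
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # Bootstrapped target: immediate reward plus discounted best next value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Best action per non-terminal state according to the learned Q-table."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


q_table = train()
policy = greedy_policy(q_table)
print(policy)
```

Because reward is given only on reaching the goal and future reward is discounted by `gamma`, the learned greedy policy should prefer moving right in every non-terminal state, which is the "policy that maximizes long-term reward" in this toy setting.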

Papers

Showing 12351-12400 of 15113 papers

Title | Status | Hype
Injecting Prior Knowledge for Transfer Learning into Reinforcement Learning Algorithms using Logic Tensor Networks | | 0
Self-Tuning Sectorization: Deep Reinforcement Learning Meets Broadcast Beam Optimization | | 0
Provably Efficient Q-learning with Function Approximation via Distribution Shift Error Checking Oracle | | 0
Epistemic Risk-Sensitive Reinforcement Learning | | 0
Cross-View Policy Learning for Street Navigation | Code | 0
Sub-policy Adaptation for Hierarchical Reinforcement Learning | | 0
Modeling and Interpreting Real-world Human Risk Decision Making with Inverse Reinforcement Learning | | 0
Deep Reinforcement Learning for Cyber Security | | 0
Deep Reinforcement Learning for Industrial Insertion Tasks with Visual Inputs and Natural Rewards | Code | 0
Conditioning of Reinforcement Learning Agents and its Policy Regularization Application | | 0
Goal-conditioned Imitation Learning | Code | 0
Deep Reinforcement Learning for Unmanned Aerial Vehicle-Assisted Vehicular Networks | | 0
Reinforcement Knowledge Graph Reasoning for Explainable Recommendation | Code | 0
When to use parametric models in reinforcement learning? | Code | 1
Regret Minimization for Reinforcement Learning by Evaluating the Optimal Bias Function | | 0
Adaptive Optimal Control for Reference Tracking Independent of Exo-System Dynamics | | 0
Search on the Replay Buffer: Bridging Planning and Reinforcement Learning | | 0
Sub-Goal Trees -- a Framework for Goal-Directed Trajectory Prediction and Optimization | | 0
Reinforcement Learning of Minimalist Numeral Grammars | | 0
Reinforcement Learning for Channel Coding: Learned Bit-Flipping Decoding | Code | 0
Reinforcement Learning for Integer Programming: Learning to Cut | | 0
Towards Inverse Reinforcement Learning for Limit Order Book Dynamics | | 0
Learning to Score Behaviors for Guided Policy Optimization | Code | 0
Dealing with Non-Stationarity in Multi-Agent Deep Reinforcement Learning | | 0
Continual Reinforcement Learning deployed in Real-life using Policy Distillation and Sim2Real Transfer | | 0
A Hybrid Approach Between Adversarial Generative Networks and Actor-Critic Policy Gradient for Low Rate High-Resolution Image Compression | | 0
Causal Discovery with Reinforcement Learning | | 0
Exploration via Hindsight Goal Generation | Code | 0
Deep Reinforcement Learning with Discrete Normalized Advantage Functions for Resource Management in Network Slicing | | 0
Boosting Soft Actor-Critic: Emphasizing Recent Experience without Forgetting the Past | Code | 1
A Survey of Reinforcement Learning Informed by Natural Language | | 0
Neural Keyphrase Generation via Reinforcement Learning with Adaptive Rewards | Code | 0
Model-Based Reinforcement Learning with a Generative Model is Minimax Optimal | | 0
Gossip-based Actor-Learner Architectures for Deep Reinforcement Learning | Code | 0
Curiosity-Driven Multi-Criteria Hindsight Experience Replay | Code | 0
Intrinsically Efficient, Stable, and Bounded Off-Policy Evaluation for Reinforcement Learning | Code | 0
Neural Heterogeneous Scheduler | | 0
Transfer Learning by Modeling a Distribution over Policies | | 0
SVRG for Policy Evaluation with Fewer Gradient Evaluations | Code | 0
Towards Optimal Off-Policy Evaluation for Reinforcement Learning with Marginalized Importance Sampling | | 0
Multi-modal Active Learning From Human Data: A Deep Reinforcement Learning Approach | | 0
Preference-based Interactive Multi-Document Summarisation | Code | 0
Worst-Case Regret Bounds for Exploration via Randomized Value Functions | | 0
Non-Stationary Reinforcement Learning: The Blessing of (More) Optimism | | 0
Ego-Pose Estimation and Forecasting as Real-Time PD Control | Code | 0
Improving Exploration in Soft-Actor-Critic with Normalizing Flows Policies | Code | 0
Playing the lottery with rewards and multiple languages: lottery tickets in RL and NLP | | 0
Towards Interpretable Reinforcement Learning Using Attention Augmented Agents | Code | 0
Clustered Reinforcement Learning | | 0
An Extensible Interactive Interface for Agent Design | | 0
Page 248 of 303

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | | Unverified