SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback on its actions, given as rewards or penalties. The goal of reinforcement learning is to find an optimal policy, that is, a decision-making strategy that maximizes the long-term reward.
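The interaction loop described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from any listed paper: the 5-cell corridor environment, the names `step`, `train`, and `greedy_policy`, the hyperparameters, the use of a discount factor `gamma` to define long-term reward, and the choice of tabular Q-learning as the learning rule.

```python
import random

# Hypothetical toy environment: a 5-cell corridor. The agent starts in cell 0
# and receives reward 1.0 only when it reaches the rightmost cell.
N_STATES = 5
ACTIONS = (0, 1)  # 0 = step left, 1 = step right


def step(state, action):
    """Return (next_state, reward, done) for one environment transition."""
    if action == 1:
        next_state = min(state + 1, N_STATES - 1)
    else:
        next_state = max(state - 1, 0)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration (assumed settings)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon, or when both actions are tied.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.choice(ACTIONS)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, action)
            # Bootstrapped target: immediate reward plus discounted best
            # estimated value of the next state (zero at episode end).
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Pick the highest-valued action in every non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


q_table = train()
policy = greedy_policy(q_table)
```

On this toy task the learned greedy policy should choose "step right" in every cell, since that is the only way to collect the reward. Real environments are typically stochastic and far larger, which is why most of the papers listed below replace the table with a function approximator.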

Papers

Showing 12501-12550 of 15113 papers

Title | Status | Hype
Continual Reinforcement Learning in 3D Non-stationary Environments | Code | 0
Automatic Machine Learning by Pipeline Synthesis using Model-Based Reinforcement Learning and a Grammar | | 0
Scene Induced Multi-Modal Trajectory Forecasting via Planning | | 0
PAC Guarantees for Cooperative Multi-Agent Reinforcement Learning with Restricted Communication | | 0
Recurrent Value Functions | | 0
Population-based Global Optimisation Methods for Learning Long-term Dependencies with RNNs | | 0
Multi-hop Reading Comprehension via Deep Reinforcement Learning based Document Traversal | Code | 0
Estimating Risk and Uncertainty in Deep Reinforcement Learning | Code | 0
From semantics to execution: Integrating action planning with reinforcement learning for robotic causal problem-solving | | 0
Hierarchical Reinforcement Learning for Concurrent Discovery of Compound and Composable Policies | Code | 0
Unknown mixing times in apprenticeship and reinforcement learning | | 0
Deep Q-Learning with Q-Matrix Transfer Learning for Novel Fire Evacuation Environment | | 0
Inverse Reinforcement Learning in Contextual MDPs | Code | 0
Hierarchical Reinforcement Learning for Quadruped Locomotion | | 0
Deep Reinforcement Learning for Detecting Malicious Websites | | 0
COBRA: Data-Efficient Model-Based RL through Unsupervised Object Discovery and Curiosity-Driven Exploration | Code | 0
Stochastic Inverse Reinforcement Learning | | 0
Maximum Entropy-Regularized Multi-Goal Reinforcement Learning | Code | 1
Issues concerning realizability of Blackwell optimal policies in reinforcement learning | | 0
Deep Reinforcement Learning Based Parameter Control in Differential Evolution | Code | 0
A Bayesian Approach to Robust Reinforcement Learning | | 0
Stochastic Variance Reduction for Deep Q-learning | | 0
Reinforcement Learning without Ground-Truth State | | 0
Perceptual Values from Observation | | 0
Reinforcement Learning for Learning of Dynamical Systems in Uncertain Environment: a Tutorial | | 0
Evolving Rewards to Automate Reinforcement Learning | | 0
In Support of Over-Parametrization in Deep Reinforcement Learning: an Empirical Study | | 0
Enforcing constraints for time series prediction in supervised, unsupervised and reinforcement learning | | 0
A Regularized Opponent Model with Maximum Entropy Objective | Code | 0
Exact-K Recommendation via Maximal Clique Optimization | Code | 0
Deep Reinforcement Learning-Based Channel Allocation for Wireless LANs with Graph Convolutional Networks | | 0
Stratospheric Aerosol Injection as a Deep Reinforcement Learning Problem | | 0
TBQ(σ): Improving Efficiency of Trace Utilization for Off-Policy Reinforcement Learning | | 0
MaMiC: Macro and Micro Curriculum for Robotic Reinforcement Learning | | 0
Stochastically Dominant Distributional Reinforcement Learning | | 0
Mastering the Game of Sungka from Random Play | Code | 0
Meta-Reinforcement Learning for Adaptive Autonomous Driving | | 0
Sub-policy Adaptation for Hierarchical Reinforcement Learning | | 0
Goal-conditioned Imitation Learning | | 0
Learning Exploration Policies for Model-Agnostic Meta-Reinforcement Learning | | 0
Deep Knowledge Based Agent: Learning to do tasks by self-thinking about imaginary worlds | | 0
QBSO-FS: A Reinforcement Learning Based Bee Swarm Optimization Metaheuristic for Feature Selection | Code | 0
Meta Reinforcement Learning with Task Embedding and Shared Policy | Code | 0
Random Expert Distillation: Imitation Learning via Expert Policy Support Estimation | Code | 0
Leveraging exploration in off-policy algorithms via normalizing flows | Code | 0
Knowledge-Based Sequential Decision-Making Under Uncertainty | | 0
A Learning based Branch and Bound for Maximum Common Subgraph Problems | | 0
Autonomous Penetration Testing using Reinforcement Learning | | 0
Addressing the Loss-Metric Mismatch with Adaptive Loss Alignment | | 0
Deep Reinforcement Learning for Scheduling in Cellular Networks | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | | Unverified