SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment to maximize a cumulative reward signal. The agent interacts with the environment and learns by receiving feedback in the form of rewards or punishments for its actions. The goal of reinforcement learning is to find the optimal policy or decision-making strategy that maximizes the long-term reward.
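The interaction loop described above can be sketched with tabular Q-learning, one standard RL algorithm chosen here purely for illustration; this page does not prescribe it. The five-state chain environment, the function names (`step`, `train`, `greedy_policy`) and the hyperparameters are all hypothetical assumptions.

```python
import random

N_STATES = 5      # hypothetical toy chain: states 0..4, state 4 is terminal
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Toy environment: reward 1.0 only when the rightmost state is reached."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Learn action values by trial and error with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)  # explore
            else:
                # exploit; break ties at random so an untrained agent still moves
                best = max(q[state])
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # one-step target: reward plus discounted best value of the next state
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Policy that picks the highest-valued action in each non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]
```

Calling `greedy_policy(train())` on this toy chain should select action 1 (move right) in every non-terminal state, since that is the only route to the reward; the discount factor `gamma` is what makes the agent weigh long-term reward rather than only the immediate one.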

Papers

Showing 11151-11200 of 15113 papers

Title | Status | Hype
Representation of Reinforcement Learning Policies in Reproducing Kernel Hilbert Spaces | | 0
Reinforcement Learning in Factored MDPs: Oracle-Efficient Algorithms and Tighter Regret Bounds for the Non-Episodic Setting | | 0
Multi Type Mean Field Reinforcement Learning | Code | 1
Temporal-adaptive Hierarchical Reinforcement Learning | | 0
Soft Hindsight Experience Replay | Code | 1
Provably Efficient Online Hyperparameter Optimization with Population-Based Bandits | Code | 1
Social diversity and social preferences in mixed-motive reinforcement learning | | 0
Attractive or Faithful? Popularity-Reinforced Learning for Inspired Headline Generation | Code | 1
Mutual Information-based State-Control for Intrinsically Motivated Reinforcement Learning | | 0
Deep Radial-Basis Value Functions for Continuous Control | | 0
Does the Markov Decision Process Fit the Data: Testing for the Markov Property in Sequential Decision Making | Code | 1
Dynamic Causal Effects Evaluation in A/B Testing with a Reinforcement Learning Framework | Code | 1
Learning Task-Driven Control Policies via Information Bottlenecks | | 0
Bootstrapping a DQN Replay Memory with Synthetic Experiences | | 0
Finite Time Analysis of Linear Two-timescale Stochastic Approximation with Markovian Noise | | 0
Policy Gradient based Quantum Approximate Optimization Algorithm | | 0
Finite-Sample Analysis of Stochastic Approximation Using Smooth Convex Envelopes | | 0
Effective Diversity in Population Based Reinforcement Learning | Code | 1
Evolutionary algorithms for constructing an ensemble of decision trees | | 0
Deep Reinforcement Learning for Autonomous Driving: A Survey | | 0
Integrating Deep Reinforcement Learning with Model-based Path Planners for Automated Driving | Code | 1
PolicyGNN: Aggregation Optimization for Graph Neural Networks | | 0
Periodic Intra-Ensemble Knowledge Distillation for Reinforcement Learning | Code | 0
Towards the Systematic Reporting of the Energy and Carbon Footprints of Machine Learning | Code | 1
Preventing Imitation Learning with Adversarial Policy Ensembles | | 0
Locally Private Distributed Reinforcement Learning | | 0
A Deep Reinforcement Learning Approach to Concurrent Bilateral Negotiation | | 0
Predicting Goal-directed Attention Control Using Inverse-Reinforcement Learning | | 0
Survey of Deep Reinforcement Learning for Motion Planning of Autonomous Vehicles | | 0
Goal-directed graph construction using reinforcement learning | Code | 1
Robust Multimodal Image Registration Using Deep Recurrent Reinforcement Learning | | 0
Asymptotically Efficient Off-Policy Evaluation for Tabular Reinforcement Learning | | 0
Distal Explanations for Model-free Explainable Reinforcement Learning | | 0
Data-driven control of micro-climate in buildings: an event-triggered reinforcement learning approach | | 0
Real-time calibration of coherent-state receivers: learning by trial and error | Code | 0
Some Insights into Lifelong Reinforcement Learning Systems | Code | 0
Rotation, Translation, and Cropping for Zero-Shot Generalization | Code | 0
Unsupervised Program Synthesis for Images By Sampling Without Replacement | | 0
Computing the Feedback Capacity of Finite State Channels using Reinforcement Learning | Code | 0
Challenges and Countermeasures for Adversarial Attacks on Deep Reinforcement Learning | Code | 0
Developing Multi-Task Recommendations with Long-Term Rewards via Policy Distilled Reinforcement Learning | | 0
Reinforcement Learning-based Application Autoscaling in the Cloud: A Survey | | 0
Tractable Reinforcement Learning of Signal Temporal Logic Objectives | Code | 0
Sentiment and Knowledge Based Algorithmic Trading with Deep Reinforcement Learning | | 0
Constrained Upper Confidence Reinforcement Learning | | 0
Multitask radiological modality invariant landmark localization using deep reinforcement learning | Code | 0
Deep Reinforcement Learning based Blind mmWave MIMO Beam Alignment | | 0
Following Instructions by Imagining and Reaching Visual Goals | | 0
Pricing commodity swing options | | 0
EgoMap: Projective mapping and structured egocentric memory for Deep RL | | 0
Page 224 of 303

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | | Unverified