SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes long-term reward.
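As a rough illustration of the agent-environment loop described above, here is a minimal sketch using tabular Q-learning, one of many RL methods and not one this page prescribes. The five-state corridor environment, the `step`, `train`, and `greedy_policy` helpers, and all hyperparameter values are hypothetical and chosen only for this example.

```python
import random

# Hypothetical toy environment: a corridor of states 0..4; state 4 is the goal.
N_STATES = 5
ACTIONS = (-1, +1)  # index 0 = move left, index 1 = move right


def step(state, action):
    """Apply an action and return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0  # reward only on reaching the goal
    return next_state, reward, done


def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration (assumed settings)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action_index]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon, or when both estimates are tied.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, ACTIONS[a])
            # Bootstrapped target: immediate reward plus discounted future value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """Best-known action for each non-goal state."""
    return [ACTIONS[0 if row[0] > row[1] else 1] for row in q[:-1]]


if __name__ == "__main__":
    print(greedy_policy(train()))  # [1, 1, 1, 1]: always move right
```

The learned policy is the "decision-making strategy" from the description: the discount factor `gamma` makes states closer to the goal more valuable, so the greedy action in every state points toward the reward.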

Papers

Showing 13001–13050 of 15113 papers

| Title | Status | Hype |
|---|---|---|
| Learning Deterministic Policy with Target for Power Control in Wireless Networks | | 0 |
| Statistics and Samples in Distributional Reinforcement Learning | | 0 |
| Curiosity-Driven Experience Prioritization via Density Estimation | | 0 |
| Beyond Confidence Regions: Tight Bayesian Ambiguity Sets for Robust MDPs | Code | 0 |
| From Language to Goals: Inverse Reinforcement Learning for Vision-Based Instruction Following | | 0 |
| Emergent Coordination Through Competition | | 0 |
| DOM-Q-NET: Grounded RL on Structured Language | Code | 0 |
| Deep Reinforcement Learning using Genetic Algorithm for Parameter Optimization | Code | 0 |
| A novel repetition normalized adversarial reward for headline generation | | 0 |
| Hyperbolic Discounting and Learning over Multiple Horizons | Code | 0 |
| Investigating Generalisation in Continuous Deep Reinforcement Learning | | 0 |
| Message-Dropout: An Efficient Training Method for Multi-Agent Deep Reinforcement Learning | | 0 |
| Parenting: Safe Reinforcement Learning from Human Input | | 0 |
| A new Potential-Based Reward Shaping for Reinforcement Learning Agent | | 0 |
| Leveraging Communication Topologies Between Learning Agents in Deep Reinforcement Learning | | 0 |
| Heuristics, Answer Set Programming and Markov Decision Process for Solving a Set of Spatial Puzzles | Code | 0 |
| Deep Reinforcement Learning Based High-level Driving Behavior Decision-making Model in Heterogeneous Traffic | | 0 |
| Asynchronous Coagent Networks | | 0 |
| Neural-encoding Human Experts' Domain Knowledge to Warm Start Reinforcement Learning | Code | 0 |
| Robust Reinforcement Learning in POMDPs with Incomplete and Noisy Observations | | 0 |
| Unsupervised Visuomotor Control through Distributional Planning Networks | Code | 0 |
| Active Perception in Adversarial Scenarios using Maximum Entropy Deep Reinforcement Learning | | 0 |
| Reinforcement Learning to Optimize Long-term User Engagement in Recommender Systems | | 0 |
| Reinforcement Learning for UAV Attitude Control | | 0 |
| Preferences Implicit in the State of the World | Code | 0 |
| ACTRCE: Augmenting Experience via Teacher's Advice For Multi-Goal Reinforcement Learning | | 0 |
| Deep Reinforcement Learning from Policy-Dependent Human Feedback | | 0 |
| Latent Space Reinforcement Learning for Steering Angle Prediction | | 0 |
| Generalization through Simulation: Integrating Simulated and Real Data into Deep Reinforcement Learning for Vision-Based Autonomous Flight | Code | 0 |
| Performance Dynamics and Termination Errors in Reinforcement Learning: A Unifying Perspective | | 0 |
| Stochastic Reinforcement Learning | | 0 |
| Whole-Chain Recommendations | | 0 |
| WiseMove: A Framework for Safe Deep Reinforcement Learning for Autonomous Driving | | 0 |
| A Bandit Framework for Optimal Selection of Reinforcement Learning Agents | | 0 |
| Reinforcement Learning from Hierarchical Critics | Code | 0 |
| Distributional reinforcement learning with linear function approximation | | 0 |
| Novelty Search for Deep Reinforcement Learning Policy Network Weights by Action Sequence Edit Metric Distance | Code | 0 |
| Rethinking the Discount Factor in Reinforcement Learning: A Decision Theoretic Approach | | 0 |
| Metaoptimization on a Distributed System for Deep Reinforcement Learning | | 0 |
| Visual search and recognition for robot task execution and monitoring | | 0 |
| Artificial Intelligence for Prosthetics - challenge solutions | Code | 0 |
| Bayesian Reinforcement Learning via Deep, Sparse Sampling | Code | 0 |
| Decentralized Multi-Agents by Imitation of a Centralized Controller | | 0 |
| A Guiding Principle for Causal Decision Problems | | 0 |
| On L_2-consistency of nearest neighbor matching | | 0 |
| Neural Fictitious Self-Play on ELF Mini-RTS | | 0 |
| Space Navigator: a Tool for the Optimization of Collision Avoidance Maneuvers | | 0 |
| Separating value functions across time-scales | Code | 0 |
| Reinforcement Learning for Optimal Load Distribution Sequencing in Resource-Sharing System | | 0 |
| Polyphonic Music Composition with LSTM Neural Networks and Reinforcement Learning | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |