SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy: a decision-making strategy that maps situations to actions and maximizes the expected long-term reward.
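
The interaction loop described above can be illustrated with a minimal sketch. It uses tabular Q-learning, which is only one of many RL methods and is not tied to any paper listed here. The five-state chain environment, the `step` function, and all hyperparameter values are hypothetical choices made for this illustration.

```python
import random

N_STATES = 5          # states 0..4; reaching state 4 ends the episode
ACTIONS = (0, 1)      # 0 = move left, 1 = move right


def step(state, action):
    """Hypothetical toy environment: returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Learn action values from reward feedback alone."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy: mostly exploit, sometimes explore.
            # Ties (e.g. the all-zero initial table) are broken at random.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[state][a])
            next_state, reward, done = step(state, action)
            # One-step target: immediate reward plus discounted future value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Best known action for each non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

In this toy setting the only reward sits at the right end of the chain, so the learned greedy policy should choose "move right" in every state. The discount factor `gamma` is what makes the agent prefer reaching the reward sooner, which is one simple reading of "long-term reward".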

Papers

Showing 9401-9425 of 15113 papers

Title | Status | Hype
LAVA: Latent Action Spaces via Variational Auto-encoding for Dialogue Policy Optimization | - | 0
Indoor Point-to-Point Navigation with Deep Reinforcement Learning and Ultra-wideband | - | 0
Weighted Entropy Modification for Soft Actor-Critic | - | 0
Adaptive Contention Window Design using Deep Q-learning | Code | 1
Counterfactual Credit Assignment in Model-Free Reinforcement Learning | - | 0
C-Learning: Learning to Achieve Goals via Recursive Classification | - | 0
Explaining Conditions for Reinforcement Learning Behaviors from Real and Imagined Data | - | 0
Multi-agent Reinforcement Learning Accelerated MCMC on Multiscale Inversion Problem | - | 0
PassGoodPool: Joint Passengers and Goods Fleet Management with Reinforcement Learning aided Pricing, Matching, and Route Planning | - | 0
Modality-Buffet for Real-Time Object Detection | - | 0
SeekNet: Improved Human Instance Segmentation and Tracking via Reinforcement Learning Based Optimized Robot Relocation | - | 0
REALab: An Embedded Perspective on Tampering | - | 0
Reinforcement Learning of Graph Neural Networks for Service Function Chaining | - | 0
Fault-Aware Robust Control via Adversarial Reinforcement Learning | - | 0
Deep Reinforcement Learning for Stochastic Computation Offloading in Digital Twin Networks | - | 0
Efficient Exploration of Reward Functions in Inverse Reinforcement Learning via Bayesian Optimization | - | 0
Combining Reinforcement Learning with Model Predictive Control for On-Ramp Merging | Code | 1
Deep Reinforcement Learning and Permissioned Blockchain for Content Caching in Vehicular Edge Computing and Networks | - | 0
Curiosity Based Reinforcement Learning on Robot Manufacturing Cell | - | 0
Leveraging the Variance of Return Sequences for Exploration Policy | - | 0
Towards Learning Controllable Representations of Physical Systems | - | 0
Towards a General Framework for ML-based Self-tuning Databases | - | 0
Scalable Reinforcement Learning Policies for Multi-Agent Control | Code | 1
NLPGym -- A toolkit for evaluating RL agents on Natural Language Processing Tasks | Code | 1
Value Function Approximations via Kernel Embeddings for No-Regret Reinforcement Learning | - | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | - | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | - | Unverified