SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes the expected long-term reward.
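
The interaction loop described above can be illustrated with a minimal sketch. It assumes a hypothetical five-state corridor environment and uses tabular Q-learning as one possible learning rule; the environment, the function names (`step`, `train`, `greedy_policy`), and all hyperparameter values are illustrative assumptions, not something this page specifies.

```python
import random

# Hypothetical toy environment (an assumption for illustration): a 5-state
# corridor. The agent starts in state 0 and receives reward +1 on reaching
# state 4, which ends the episode. Every other step gives reward 0.
N_STATES = 5
ACTIONS = (-1, +1)  # index 0 = move left, index 1 = move right


def step(state, action):
    """Return (next_state, reward, done) for the toy corridor."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                # Explore: pick a random action.
                a = rng.randrange(len(ACTIONS))
            else:
                # Exploit: pick a best-valued action, breaking ties randomly.
                best = max(q[state])
                a = rng.choice([i for i, v in enumerate(q[state]) if v == best])
            next_state, reward, done = step(state, ACTIONS[a])
            # Move Q(s, a) toward the reward plus the discounted best next value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """Highest-valued action in every non-terminal state."""
    return [ACTIONS[max(range(len(ACTIONS)), key=lambda i: q[s][i])]
            for s in range(N_STATES - 1)]


if __name__ == "__main__":
    q_table = train()
    print(greedy_policy(q_table))
```

In this sketch the reward feedback is the only learning signal: the agent is never told that moving right is correct, and the discount factor `gamma` is what makes shorter paths to the goal more valuable than longer ones.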

Papers

Showing 7251-7300 of 15113 papers

| Title | Status | Hype |
| --- | --- | --- |
| Safe Reinforcement Learning for Legged Locomotion | | 0 |
| Target Network and Truncation Overcome The Deadly Triad in Q-Learning | | 0 |
| Reinforcement Learning in Modern Biostatistics: Constructing Optimal Adaptive Interventions | | 0 |
| GraspARL: Dynamic Grasping via Adversarial Reinforcement Learning | | 0 |
| Cloud-Edge Training Architecture for Sim-to-Real Deep Reinforcement Learning | | 0 |
| Intrinsically-Motivated Reinforcement Learning: A Brief Introduction | | 0 |
| Bilateral Deep Reinforcement Learning Approach for Better-than-human Car Following Model | | 0 |
| Deep Q-network using reservoir computing with multi-layered readout | | 0 |
| Reasoning about Counterfactuals to Improve Human Inverse Reinforcement Learning | Code | 0 |
| Optimized cost function for demand response coordination of multiple EV charging stations using reinforcement learning | | 0 |
| On Practical Reinforcement Learning: Provable Robustness, Scalability, and Statistical Efficiency | Code | 0 |
| The Best of Both Worlds: Reinforcement Learning with Logarithmic Regret and Policy Switches | | 0 |
| Quantum Reinforcement Learning via Policy Iteration | | 0 |
| Pareto Frontier Approximation Network (PA-Net) to Solve Bi-objective TSP | | 0 |
| Reliable validation of Reinforcement Learning Benchmarks | | 0 |
| Evolving Curricula with Regret-Based Environment Design | | 0 |
| Combining Reinforcement Learning and Optimal Transport for the Traveling Salesman Problem | Code | 0 |
| A Survey on Offline Reinforcement Learning: Taxonomy, Review, and Open Problems | Code | 0 |
| Follow your Nose: Using General Value Functions for Directed Exploration in Reinforcement Learning | | 0 |
| Andes_gym: A Versatile Environment for Deep Reinforcement Learning in Power Systems | Code | 0 |
| Learning in Sparse Rewards settings through Quality-Diversity algorithms | | 0 |
| Integrating Contrastive Learning with Dynamic Models for Reinforcement Learning from Images | Code | 0 |
| DreamingV2: Reinforcement Learning with Discrete World Models without Reconstruction | | 0 |
| Hierarchical Reinforcement Learning with AI Planning Models | Code | 0 |
| Distributional Reinforcement Learning for Scheduling of Chemical Production Processes | | 0 |
| Explaining a Deep Reinforcement Learning Docking Agent Using Linear Model Trees with User Adapted Visualization | | 0 |
| Approximating a deep reinforcement learning docking agent using linear model trees | | 0 |
| A Theory of Abstraction in Reinforcement Learning | | 0 |
| On the Generalization of Representations in Reinforcement Learning | Code | 0 |
| Provably Efficient Convergence of Primal-Dual Actor-Critic with Nonlinear Function Approximation | | 0 |
| Probing the Robustness of Trained Metrics for Conversational Dialogue Systems | Code | 0 |
| Pessimistic Q-Learning for Offline Reinforcement Learning: Towards Optimal Sample Complexity | | 0 |
| Weakly Supervised Disentangled Representation for Goal-conditioned Reinforcement Learning | | 0 |
| A Survey on Recent Advances and Challenges in Reinforcement Learning Methods for Task-Oriented Dialogue Policy Learning | | 0 |
| Neural-Progressive Hedging: Enforcing Constraints in Reinforcement Learning with Stochastic Programming | | 0 |
| RL-PGO: Reinforcement Learning-based Planar Pose-Graph Optimization | Code | 0 |
| Statistically Efficient Advantage Learning for Offline Reinforcement Learning in Infinite Horizons | Code | 0 |
| Whittle Index based Q-Learning for Wireless Edge Caching with Linear Function Approximation | | 0 |
| Distributed Multi-Agent Reinforcement Learning Based on Graph-Induced Local Value Functions | | 0 |
| Domain Knowledge-Based Automated Analog Circuit Design with Deep Reinforcement Learning | | 0 |
| Learning Dynamic Mechanisms in Unknown Environments: A Reinforcement Learning Approach | | 0 |
| Decision Making in Non-Stationary Environments with Policy-Augmented Monte Carlo Tree Search | | 0 |
| Exploring with Sticky Mittens: Reinforcement Learning with Expert Interventions via Option Templates | Code | 0 |
| Context-Hierarchy Inverse Reinforcement Learning | | 0 |
| Consolidated Adaptive T-soft Update for Deep Reinforcement Learning | | 0 |
| Reachability analysis in stochastic directed graphs by reinforcement learning | | 0 |
| Quantum Deep Reinforcement Learning for Robot Navigation Tasks | Code | 0 |
| Learning Transferable Reward for Query Object Localization with Policy Adaptation | Code | 0 |
| Evolving-to-Learn Reinforcement Learning Tasks with Spiking Neural Networks | | 0 |
| Evolutionary Multi-Objective Reinforcement Learning Based Trajectory Control and Task Offloading in UAV-Assisted Mobile Edge Computing | | 0 |
Page 146 of 303

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |