SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from the feedback it receives, in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes long-term reward.
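
The agent-environment loop described above can be illustrated with a minimal sketch. It uses tabular Q-learning, chosen here only as one simple example of an RL method, on a hypothetical six-cell corridor. The environment, the function names (`step`, `train`, `greedy_policy`), and the hyperparameter values are all assumptions made for illustration and do not come from any paper listed on this page.

```python
import random

# Hypothetical toy environment (an assumption, not from the source page):
# a corridor of N_STATES cells. The agent starts in cell 0 and receives
# reward 1.0 when it reaches the last cell, which ends the episode.
N_STATES = 6
ACTIONS = (-1, +1)  # step left, step right


def step(state, action):
    """Apply an action; return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action_index]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon (or on ties), else act greedily.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, ACTIONS[a])
            # Move Q(s, a) toward the target r + gamma * max_a' Q(s', a').
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """Return the greedy action (-1 or +1) for every non-terminal state."""
    return [ACTIONS[0 if row[0] > row[1] else 1] for row in q[:-1]]
```

Calling `greedy_policy(train())` returns the learned action for each non-terminal cell. In this toy setting the discounted return is largest when the agent always moves right, so that is the policy the learned values should favour.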

Papers

Showing 10951-11000 of 15113 papers

| Title | Status | Hype |
| SQUIRL: Robust and Efficient Learning from Video Demonstration of Long-Horizon Robotic Manipulation Tasks | | 0 |
| Exploring Unknown States with Action Balance | Code | 0 |
| Explore and Exploit with Heterotic Line Bundle Models | Code | 0 |
| Curriculum Learning for Reinforcement Learning Domains: A Framework and Survey | | 0 |
| Automatic Curriculum Learning For Deep RL: A Short Survey | | 0 |
| Advancing Renewable Electricity Consumption With Reinforcement Learning | | 0 |
| Transfer Reinforcement Learning under Unobserved Contextual Information | | 0 |
| Zooming for Efficient Model-Free Reinforcement Learning in Metric Spaces | | 0 |
| Stable Policy Optimization via Off-Policy Divergence Regularization | Code | 0 |
| Q* Approximation Schemes for Batch Reinforcement Learning: A Theoretical Comparison | | 0 |
| Human AI interaction loop training: New approach for interactive reinforcement learning | | 0 |
| Generative Adversarial Imitation Learning with Neural Networks: Global Optimality and Convergence Rate | | 0 |
| Deep Adversarial Reinforcement Learning for Object Disentangling | | 0 |
| On the Robustness of Cooperative Multi-Agent Reinforcement Learning | Code | 1 |
| Reinforcement Learning Based Cooperative Coded Caching under Dynamic Popularities in Ultra-Dense Networks | | 0 |
| Reinforcement Learning for Combinatorial Optimization: A Survey | | 0 |
| Convergence of Q-value in case of Gaussian rewards | | 0 |
| IG-RL: Inductive Graph Reinforcement Learning for Massive-Scale Traffic Signal Control | Code | 1 |
| Lane-Merging Using Policy-based Reinforcement Learning and Post-Optimization | | 0 |
| Cost-Sensitive Portfolio Selection via Deep Reinforcement Learning | | 0 |
| Smart Train Operation Algorithms based on Expert Knowledge and Reinforcement Learning | | 0 |
| Deep Reinforcement Learning-Based Robust Protection in DER-Rich Distribution Grids | | 0 |
| Efficient and Effective Similar Subtrajectory Search with Deep Reinforcement Learning | | 0 |
| Distributional Robustness and Regularization in Reinforcement Learning | | 0 |
| Reward Design in Cooperative Multi-agent Reinforcement Learning for Packet Routing | | 0 |
| A Geometric Perspective on Visual Imitation Learning | | 0 |
| Dynamic Experience Replay | | 0 |
| Privacy-Aware Time-Series Data Sharing with Deep Reinforcement Learning | | 0 |
| Neural-Network Heuristics for Adaptive Bayesian Quantum Estimation | | 0 |
| Efficient statistical validation with edge cases to evaluate Highly Automated Vehicles | | 0 |
| Deep Reinforcement Learning for QoS-Constrained Resource Allocation in Multiservice Networks | | 0 |
| Embodied Synaptic Plasticity with Online Reinforcement learning | Code | 1 |
| Efficient Exploration in Constrained Environments with Goal-Oriented Reference Path | | 0 |
| Contention Window Optimization in IEEE 802.11ax Networks with Deep Reinforcement Learning | Code | 1 |
| Can Increasing Input Dimensionality Improve Deep Reinforcement Learning? | Code | 1 |
| Learning Context-aware Task Reasoning for Efficient Meta-reinforcement Learning | | 0 |
| Safe Reinforcement Learning for Autonomous Vehicles through Parallel Constrained Policy Optimization | | 0 |
| Robust Market Making via Adversarial Reinforcement Learning | Code | 1 |
| Relevance-Guided Modeling of Object Dynamics for Reinforcement Learning | | 0 |
| PPMC RL Training Algorithm: Rough Terrain Intelligent Robots through Reinforcement Learning | Code | 1 |
| Upper Confidence Primal-Dual Reinforcement Learning for CMDP with Adversarial Loss | | 0 |
| Real-World Human-Robot Collaborative Reinforcement Learning | | 0 |
| Dynamic Queue-Jump Lane for Emergency Vehicles under Partially Connected Settings: A Multi-Agent Deep Reinforcement Learning Approach | | 0 |
| Scaling Up Multiagent Reinforcement Learning for Robotic Systems: Learn an Adaptive Sparse Communication Graph | | 0 |
| MVP: Unified Motion and Visual Self-Supervised Learning for Large-Scale Robotic Navigation | Code | 1 |
| Risk-Averse Learning by Temporal Difference Methods | | 0 |
| Adaptive Structural Hyper-Parameter Configuration by Q-Learning | | 0 |
| Formal Controller Synthesis for Continuous-Space MDPs via Model-Free Reinforcement Learning | | 0 |
| Cluster-Based Social Reinforcement Learning | | 0 |
| Gaussian Process Policy Optimization | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |