SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment to maximize a cumulative reward signal. The agent interacts with the environment and learns by receiving feedback in the form of rewards or punishments for its actions. The goal of reinforcement learning is to find the optimal policy or decision-making strategy that maximizes the long-term reward.
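As a minimal sketch of the agent-environment loop described above, the following uses tabular Q-learning on a hypothetical five-state chain environment. The environment, the hyperparameters, and all function names (`step`, `train`, `greedy_policy`) are assumptions made for illustration; they do not come from any paper listed on this page.

```python
import random

N_STATES = 5      # hypothetical chain: states 0..4, state 4 is terminal
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Toy environment dynamics: returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0  # reward only on reaching the terminal state
    return next_state, reward, done


def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behavior policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon (or on ties), otherwise exploit.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.choice(ACTIONS)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, action)
            # Bootstrapped target: immediate reward plus discounted future value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Best action per non-terminal state under the learned Q-table."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

The reward here is sparse (it arrives only at the final state), so the discount factor `gamma` is what makes earlier "move right" actions look valuable: the long-term reward is propagated backward through the bootstrapped target.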

Papers

Showing 6051-6100 of 15113 papers

| Title | Status | Hype |
| --- | --- | --- |
| Investigation of Factorized Optical Flows as Mid-Level Representations | | 0 |
| Reinforced MOOCs Concept Recommendation in Heterogeneous Information Networks | | 0 |
| Multi-Agent Broad Reinforcement Learning for Intelligent Traffic Light Control | | 0 |
| Rényi State Entropy for Exploration Acceleration in Reinforcement Learning | | 0 |
| A Complete Characterization of Linear Estimators for Offline Policy Evaluation | | 0 |
| Curriculum-based Reinforcement Learning for Distribution System Critical Load Restoration | Code | 1 |
| Designing Heterogeneous GNNs with Desired Permutation Properties for Wireless Resource Allocation | | 0 |
| Distributed Control using Reinforcement Learning with Temporal-Logic-Based Reward Shaping | | 0 |
| Graph-based Reinforcement Learning meets Mixed Integer Programs: An application to 3D robot assembly discovery | | 0 |
| Policy-Based Bayesian Experimental Design for Non-Differentiable Implicit Models | | 0 |
| Robot Learning of Mobile Manipulation with Reachability Behavior Priors | | 0 |
| A Survey on Reinforcement Learning Methods in Character Animation | | 0 |
| Learn to Match with No Regret: Reinforcement Learning in Markov Matching Markets | | 0 |
| Fast and Data Efficient Reinforcement Learning from Pixels via Non-Parametric Value Approximation | | 0 |
| Deep Reinforcement Learning for Entity Alignment | Code | 1 |
| Graph Neural Networks for Image Classification and Reinforcement Learning using Graph representations | | 0 |
| Influencing Long-Term Behavior in Multiagent Reinforcement Learning | Code | 1 |
| Efficient Policy Generation in Multi-Agent Systems via Hypergraph Neural Network | | 0 |
| Knowledge Transfer in Deep Reinforcement Learning for Slice-Aware Mobility Robustness Optimization | | 0 |
| Cascaded Gaps: Towards Gap-Dependent Regret for Risk-Sensitive Reinforcement Learning | | 0 |
| Scalable multi-agent reinforcement learning for distributed control of residential energy flexibility | | 0 |
| Reliably Re-Acting to Partner's Actions with the Social Intrinsic Motivation of Transfer Empowerment | Code | 1 |
| Reinforcement Learning for Location-Aware Scheduling | | 0 |
| On Credit Assignment in Hierarchical Reinforcement Learning | Code | 0 |
| Black-Box Safety Validation of Autonomous Systems: A Multi-Fidelity Reinforcement Learning Approach | | 0 |
| Offline Deep Reinforcement Learning for Dynamic Pricing of Consumer Credit | | 0 |
| Watch from sky: machine-learning-based multi-UAV network for predictive police surveillance | | 0 |
| Recursive Reasoning Graph for Multi-Agent Reinforcement Learning | | 0 |
| Hierarchically Structured Scheduling and Execution of Tasks in a Multi-Agent Environment | | 0 |
| A Multi-Document Coverage Reward for RELAXed Multi-Document Summarization | Code | 0 |
| Depthwise Convolution for Multi-Agent Communication with Enhanced Mean-Field Approximation | | 0 |
| Leveraging Reward Gradients For Reinforcement Learning in Differentiable Physics Simulations | | 0 |
| Deep Reinforcement Learning based Model-free On-line Dynamic Multi-Microgrid Formation to Enhance Resilience | | 0 |
| Safe Reinforcement Learning for Legged Locomotion | | 0 |
| Target Network and Truncation Overcome The Deadly Triad in Q-Learning | | 0 |
| Reinforcement Learning in Modern Biostatistics: Constructing Optimal Adaptive Interventions | | 0 |
| Cloud-Edge Training Architecture for Sim-to-Real Deep Reinforcement Learning | | 0 |
| GraspARL: Dynamic Grasping via Adversarial Reinforcement Learning | | 0 |
| Bilateral Deep Reinforcement Learning Approach for Better-than-human Car Following Model | | 0 |
| Intrinsically-Motivated Reinforcement Learning: A Brief Introduction | | 0 |
| Deep Q-network using reservoir computing with multi-layered readout | | 0 |
| On Practical Reinforcement Learning: Provable Robustness, Scalability, and Statistical Efficiency | Code | 0 |
| The Best of Both Worlds: Reinforcement Learning with Logarithmic Regret and Policy Switches | | 0 |
| Testing Stationarity and Change Point Detection in Reinforcement Learning | Code | 1 |
| Quantum Reinforcement Learning via Policy Iteration | | 0 |
| Optimized cost function for demand response coordination of multiple EV charging stations using reinforcement learning | | 0 |
| Reasoning about Counterfactuals to Improve Human Inverse Reinforcement Learning | Code | 0 |
| Reliable validation of Reinforcement Learning Benchmarks | | 0 |
| Andes_gym: A Versatile Environment for Deep Reinforcement Learning in Power Systems | Code | 0 |
| Integrating Contrastive Learning with Dynamic Models for Reinforcement Learning from Images | Code | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |