SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes the expected long-term reward.
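
The interaction loop described above can be sketched with tabular Q-learning on a toy problem. This is only an illustrative sketch: the five-state chain environment, the `step`, `train`, and `greedy_policy` helpers, and all hyperparameter values are assumptions made for this example, and Q-learning is just one of many RL algorithms.

```python
import random

N_STATES = 5      # hypothetical chain: states 0..4, state 4 is terminal
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Toy environment: reward 1.0 on reaching the last state, else 0.0."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Learn action values Q[state][action] from interaction alone."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                # Explore: pick a random action.
                action = rng.choice(ACTIONS)
            else:
                # Exploit: pick a best-known action, breaking ties randomly.
                best = max(q[state])
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # Q-learning target: immediate reward plus discounted future value.
            future = 0.0 if done else gamma * max(q[next_state])
            q[state][action] += alpha * (reward + future - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Best action in each non-terminal state according to the learned values."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

In this sketch the discount factor `gamma` is what makes the agent value long-term reward: the only reward sits at the end of the chain, so the learned policy has to propagate that value back to earlier states.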

Papers

Showing 10201–10225 of 15113 papers

Title | Status | Hype
Robust Constrained Reinforcement Learning for Continuous Control with Model Misspecification |  | 0
Negotiating Team Formation Using Deep Reinforcement Learning |  | 0
Quality of service based radar resource management using deep reinforcement learning |  | 0
Learning by Competition of Self-Interested Reinforcement Learning Agents | Code | 0
Imitation with Neural Density Models |  | 0
Evaluating the Safety of Deep Reinforcement Learning Models using Semi-Formal Verification |  | 0
A case for new neural networks smoothness constraints |  | 0
Chance-Constrained Control with Lexicographic Deep Reinforcement Learning |  | 0
A Reinforcement Learning Approach to Health Aware Control Strategy |  | 0
Average-reward model-free reinforcement learning: a systematic review and literature mapping |  | 0
DeepAveragers: Offline Reinforcement Learning by Solving Derived Non-Parametric MDPs | Code | 0
Multi-Agent Reinforcement Learning in NOMA-aided UAV Networks for Cellular Offloading |  | 0
Model-Based Inverse Reinforcement Learning from Visual Demonstrations |  | 0
Neural Algorithms for Graph Navigation |  | 0
Scalable Evolution Strategies Pipeline for Solving the Vehicle Routing Problem |  | 0
Variational Dynamic for Self-Supervised Exploration in Deep Reinforcement Learning |  | 0
Learning Elimination Ordering for Tree Decomposition Problem |  | 0
Assessment of Reward Functions in Reinforcement Learning for Multi-Modal Urban Traffic Control under Real-World limitations |  | 0
Learning Lower Bounds for Graph Exploration With Reinforcement Learning |  | 0
Interpretable Disease Prediction based on Reinforcement Path Reasoning over Knowledge Graphs |  | 0
DOOM: A Novel Adversarial-DRL-Based Op-Code Level Metamorphic Malware Obfuscator for the Enhancement of IDS |  | 0
Few-shot model-based adaptation in noisy conditions |  | 0
Decomposability and Parallel Computation of Multi-Agent LQR |  | 0
Efficient Robotic Object Search via HIEM: Hierarchical Policy Learning with Intrinsic-Extrinsic Modeling |  | 0
Hyperparameter Auto-tuning in Self-Supervised Robotic Learning | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 |  | Unverified
2 | PPO | Mean Normalized Performance | 0.58 |  | Unverified