SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment to maximize a cumulative reward signal. The agent interacts with the environment and learns by receiving feedback in the form of rewards or punishments for its actions. The goal of reinforcement learning is to find the optimal policy or decision-making strategy that maximizes the long-term reward.
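The agent, environment, reward, and policy described above can be sketched with tabular Q-learning on a toy problem. This is a minimal illustration and is not taken from any paper listed here. The five-state chain environment, the function names (`step`, `train`, `greedy_policy`), and the hyperparameter values are all assumptions chosen for the example.

```python
import random

# Hypothetical toy environment: a chain of states 0..4, where state 4 is the goal.
N_STATES = 5
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Apply an action and return (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def greedy_action(q, state, rng):
    """Pick the highest-valued action, breaking ties at random."""
    best = max(q[state])
    return rng.choice([a for a in ACTIONS if q[state][a] == best])


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn action values by interacting with the toy environment."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap the episode length
            # Epsilon-greedy: explore occasionally, otherwise exploit.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = greedy_action(q, state, rng)
            next_state, reward, done = step(state, action)
            # Q-learning target: immediate reward plus discounted future value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Return the learned action for each non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]
```

Calling `greedy_policy(train())` should yield a policy that moves right in every non-terminal state, since that is the only way to reach the rewarded goal. The discount factor `gamma` is what makes the agent weigh long-term reward rather than only the immediate reward.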

Papers

Showing 5551-5575 of 15113 papers

Title | Status | Hype
Route Optimization via Environment-Aware Deep Network and Reinforcement Learning | | 0
Routing algorithms as tools for integrating social distancing with emergency evacuation | | 0
Routing and Placement of Macros using Deep Reinforcement Learning | | 0
RPM: Generalizable Behaviors for Multi-Agent Reinforcement Learning | | 0
RSO: A Novel Reinforced Swarm Optimization Algorithm for Feature Selection | | 0
RTDK-BO: High Dimensional Bayesian Optimization with Reinforced Transformer Deep kernels | | 0
Rule-Aware Reinforcement Learning for Knowledge Graph Reasoning | | 0
Rule-Based Reinforcement Learning for Efficient Robot Navigation with Space Reduction | | 0
Rule-Bottleneck Reinforcement Learning: Joint Explanation and Decision Optimization for Resource Allocation with Language Agents | | 0
Rule Mining over Knowledge Graphs via Reinforcement Learning | | 0
Run-and-tumble chemotaxis using reinforcement learning | | 0
Runtime Adaptation in Wireless Sensor Nodes Using Structured Learning | | 0
Run Time Assured Reinforcement Learning for Six Degree-of-Freedom Spacecraft Inspection | | 0
Runtime Safety Assurance Using Reinforcement Learning | | 0
Runtime Verification of Learning Properties for Reinforcement Learning Algorithms | | 0
S2RL: Do We Really Need to Perceive All States in Deep Multi-Agent Reinforcement Learning? | | 0
S2VG: Soft Stochastic Value Gradient method | | 0
S4RL: Surprisingly Simple Self-Supervision for Offline Reinforcement Learning | | 0
SAAC: Safe Reinforcement Learning as an Adversarial Game of Actor-Critics | | 0
SAC-GLAM: Improving Online RL for LLM agents with Soft Actor-Critic and Hindsight Relabeling | | 0
Random Policy Enables In-Context Reinforcement Learning within Trust Horizons | | 0
Safe and Psychologically Pleasant Traffic Signal Control with Reinforcement Learning using Action Masking | | 0
Safe and Robust Reinforcement Learning: Principles and Practice | | 0
Safe Approximate Dynamic Programming Via Kernelized Lipschitz Estimation | | 0
Safe Continual Domain Adaptation after Sim2Real Transfer of Reinforcement Learning Policies in Robotics | | 0
Page 223 of 605

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | | Unverified