SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment to maximize a cumulative reward signal. The agent interacts with the environment and learns by receiving feedback in the form of rewards or punishments for its actions. The goal of reinforcement learning is to find the optimal policy or decision-making strategy that maximizes the long-term reward.
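The interaction loop described above can be sketched with tabular Q-learning on a toy problem. This is a minimal illustration, not a method taken from any paper listed on this page: the five-state corridor environment, the `step`, `greedy`, and `train` helpers, and all hyperparameter values are assumptions chosen for the example.

```python
import random

N_STATES = 5          # states 0..4; state 4 is the terminal goal (assumed toy setup)
ACTIONS = (0, 1)      # 0 = move left, 1 = move right


def step(state, action):
    """Hypothetical environment: returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    # The only feedback is a reward of 1.0 on reaching the goal.
    return next_state, (1.0 if done else 0.0), done


def greedy(q, state, rng):
    """Pick the highest-valued action, breaking ties at random."""
    best = max(q[state])
    return rng.choice([a for a in ACTIONS if q[state][a] == best])


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # action-value table Q[state][action]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy: explore occasionally, otherwise exploit current estimates.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = greedy(q, state, rng)
            next_state, reward, done = step(state, action)
            # Q-learning target: immediate reward plus discounted best future value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


q_table = train()
# Greedy policy for the non-terminal states, read off the learned table.
policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
print(policy)
```

Here the policy is simply the action with the largest learned value in each state, and the discount factor `gamma` controls how strongly long-term reward is weighted against immediate reward.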

Papers

Showing 2926–2950 of 15113 papers

| Title | Status | Hype |
| --- | --- | --- |
| AI-as-a-Service Toolkit for Human-Centered Intelligence in Autonomous Driving | | 0 |
| AttackGNN: Red-Teaming GNNs in Hardware Security Using Reinforcement Learning | | 0 |
| A* Tree Search for Portfolio Management | | 0 |
| ACECODER: Acing Coder RL via Automated Test-Case Synthesis | | 0 |
| A Hysteretic Q-learning Coordination Framework for Emerging Mobility Systems in Smart Cities | | 0 |
| A Transferable Approach for Partitioning Machine Learning Models on Multi-Chip-Modules | | 0 |
| Constrained Reinforcement Learning Has Zero Duality Gap | | 0 |
| A Transferable and Automatic Tuning of Deep Reinforcement Learning for Cost Effective Phishing Detection | | 0 |
| Adaptive and Multiple Time-scale Eligibility Traces for Online Deep Reinforcement Learning | | 0 |
| Deciding What's Fair: Challenges of Applying Reinforcement Learning in Online Marketplaces | | 0 |
| Deciding What to Model: Value-Equivalent Sampling for Reinforcement Learning | | 0 |
| Decision ConvFormer: Local Filtering in MetaFormer is Sufficient for Decision Making | | 0 |
| ATraDiff: Accelerating Online Reinforcement Learning with Imaginary Trajectories | | 0 |
| A Tractable Algorithm For Finite-Horizon Continuous Reinforcement Learning | | 0 |
| A Hybrid PAC Reinforcement Learning Algorithm | | 0 |
| A Hybrid Neuro-Symbolic approach for Text-Based Games using Inductive Logic Programming | | 0 |
| Adaptive Aggregation for Safety-Critical Control | | 0 |
| Deceptive Reinforcement Learning for Privacy-Preserving Planning | | 0 |
| INTAGS: Interactive Agent-Guided Simulation | | 0 |
| At Human Speed: Deep Reinforcement Learning with Action Delay | | 0 |
| Adaptive Adversarial Training for Meta Reinforcement Learning | | 0 |
| A Hybrid Approach for Reinforcement Learning Using Virtual Policy Gradient for Balancing an Inverted Pendulum | | 0 |
| A Fast Convergence Theory for Offline Decision Making | | 0 |
| ACE: An Actor Ensemble Algorithm for Continuous Control with Tree Search | | 0 |
| AACC: Asymmetric Actor-Critic in Contextual Reinforcement Learning | | 0 |
Page 118 of 605

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |