SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback on its actions, given as rewards or penalties. The goal is to find the optimal policy, that is, the decision-making strategy that maximizes the long-term reward.
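The interaction loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `CorridorEnv`, its reward values, and the two policies are hypothetical and do not come from any paper or library listed on this page. The sketch shows how an agent's actions produce rewards and how the cumulative reward of an episode differs between policies. It does not include a learning algorithm.

```python
import random


class CorridorEnv:
    """Hypothetical toy environment: the agent walks along a corridor
    and the episode ends when it reaches the rightmost cell."""

    def __init__(self, length=5):
        self.length = length
        self.pos = 0

    def reset(self):
        # Start every episode at the leftmost cell.
        self.pos = 0
        return self.pos

    def step(self, action):
        # action 1 moves right; any other action moves left.
        move = 1 if action == 1 else -1
        self.pos = max(0, min(self.length - 1, self.pos + move))
        done = self.pos == self.length - 1
        # Assumed reward scheme: +1.0 at the goal, -0.1 per other step.
        reward = 1.0 if done else -0.1
        return self.pos, reward, done


def run_episode(env, policy, max_steps=50):
    """Run one episode and return the cumulative (undiscounted) reward."""
    state = env.reset()
    total = 0.0
    for _ in range(max_steps):
        action = policy(state)
        state, reward, done = env.step(action)
        total += reward
        if done:
            break
    return total


def always_right(state):
    # The optimal policy for this toy environment.
    return 1


def make_random_policy(seed):
    # A uniformly random policy with its own seeded generator.
    rng = random.Random(seed)
    return lambda state: rng.choice([0, 1])


if __name__ == "__main__":
    env = CorridorEnv()
    print("always right:", run_episode(env, always_right))
    print("random:", run_episode(env, make_random_policy(0)))
```

In this toy setting the always-right policy reaches the goal in four steps, for a cumulative reward of roughly 0.7, and no other policy can do better. An RL algorithm would have to discover such a policy from the reward feedback alone.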

Papers

Showing 6751–6775 of 15113 papers

| Title | Status | Hype |
| --- | --- | --- |
| Beyond Value: CHECKLIST for Testing Inferences in Planning-Based RL | | 0 |
| Reward Poisoning Attacks on Offline Multi-Agent Reinforcement Learning | | 0 |
| MACC: Cross-Layer Multi-Agent Congestion Control with Deep Reinforcement Learning | | 0 |
| Reinforcement Learning with Neural Radiance Fields | | 0 |
| Offline Reinforcement Learning with Causal Structured World Models | | 0 |
| Disentangling Epistemic and Aleatoric Uncertainty in Reinforcement Learning | | 0 |
| A Deep Reinforcement Learning Framework For Column Generation | Code | 0 |
| Joint Energy Dispatch and Unit Commitment in Microgrids Based on Deep Reinforcement Learning | | 0 |
| KCRL: Krasovskii-Constrained Reinforcement Learning with Guaranteed Stability in Nonlinear Dynamical Systems | | 0 |
| Equivariant Reinforcement Learning for Quadrotor UAV | | 0 |
| Incorporating Explicit Uncertainty Estimates into Deep Offline Reinforcement Learning | | 0 |
| Incrementality Bidding via Reinforcement Learning under Mixed and Delayed Rewards | | 0 |
| HEX: Human-in-the-loop Explainability via Deep Reinforcement Learning | | 0 |
| Sample-Efficient Reinforcement Learning of Partially Observable Markov Games | | 0 |
| Policy Gradient Algorithms with Monte Carlo Tree Learning for Non-Markov Decision Processes | | 0 |
| Posterior Coreset Construction with Kernelized Stein Discrepancy for Model-Based Reinforcement Learning | | 0 |
| RACA: Relation-Aware Credit Assignment for Ad-Hoc Cooperation in Multi-Agent Deep Reinforcement Learning | | 0 |
| Reinforcement learning based parameters adaption method for particle swarm optimization | | 0 |
| Offline Reinforcement Learning with Differential Privacy | | 0 |
| RLSS: A Deep Reinforcement Learning Algorithm for Sequential Scene Generation | | 0 |
| On Gap-dependent Bounds for Offline Reinforcement Learning | | 0 |
| Predecessor Features | | 0 |
| Model Generation with Provable Coverability for Offline Reinforcement Learning | | 0 |
| Provably Efficient Offline Multi-agent Reinforcement Learning via Strategy-wise Bonus | | 0 |
| Provably Efficient Lifelong Reinforcement Learning with Linear Function Approximation | | 0 |
Page 271 of 605

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |