SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes the expected long-term reward.
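
As a rough illustration of the agent-environment loop described above, the following minimal sketch runs tabular Q-learning on a toy five-cell corridor. The environment, the names (`step`, `choose_action`, `train`), and all hyperparameters are assumptions made up for this example; they are not taken from any paper listed on this page.

```python
import random

N_STATES = 5          # hypothetical corridor with cells 0..4; cell 4 is the goal
ACTIONS = (0, 1)      # 0 = move left, 1 = move right


def step(state, action):
    """Toy environment: return (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def choose_action(q_table, state, epsilon, rng):
    """Epsilon-greedy selection with random tie-breaking."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    best = max(q_table[state])
    return rng.choice([a for a in ACTIONS if q_table[state][a] == best])


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, max_steps=100, seed=0):
    """Learn action values from interaction; all defaults are illustrative."""
    rng = random.Random(seed)
    q_table = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            action = choose_action(q_table, state, epsilon, rng)
            next_state, reward, done = step(state, action)
            # Q-learning update: move the estimate toward the reward plus
            # the discounted value of the best action in the next state.
            target = reward if done else reward + gamma * max(q_table[next_state])
            q_table[state][action] += alpha * (target - q_table[state][action])
            state = next_state
            if done:
                break
    return q_table


q_table = train()
# Greedy policy: the highest-valued action in each non-goal cell.
policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
print(policy)
```

With these settings the greedy policy is expected to choose action 1 (move right) in every non-goal cell, which is the strategy that maximizes the discounted long-term reward in this toy setup.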

Papers

Showing 7201-7250 of 15113 papers

| Title | Status | Hype |
| --- | --- | --- |
| Stability Constrained Reinforcement Learning for Real-Time Voltage Control | | 0 |
| Modeling Interactions of Autonomous Vehicles and Pedestrians with Deep Multi-Agent Reinforcement Learning for Collision Avoidance | | 0 |
| Reinforcement Learning with Information-Theoretic Actuation | | 0 |
| Unified Data Collection for Visual-Inertial Calibration via Deep Reinforcement Learning | Code | 1 |
| Is Policy Learning Overrated?: Width-Based Planning and Active Learning for Atari | Code | 0 |
| Scalable Online Planning via Reinforcement Learning Fine-Tuning | Code | 1 |
| Solving the Real Robot Challenge using Deep Reinforcement Learning | Code | 0 |
| Reinforcement Learning for Classical Planning: Viewing Heuristics as Dense Reward Generators | | 0 |
| Surveillance Evasion Through Bayesian Reinforcement Learning | Code | 0 |
| A Privacy-preserving Distributed Training Framework for Cooperative Multi-agent Deep Reinforcement Learning | | 0 |
| HLIC: Harmonizing Optimization Metrics in Learned Image Compression by Reinforcement Learning | | 0 |
| Bitcoin Transaction Strategy Construction Based on Deep Reinforcement Learning | | 0 |
| Coordinated Reinforcement Learning for Optimizing Mobile Networks | | 0 |
| Generalized Maximum Entropy Reinforcement Learning via Reward Shaping | | 0 |
| CubeTR: Learning to Solve the Rubik's Cube using Transformers | | 0 |
| Policy improvement by planning with Gumbel | Code | 2 |
| Revisiting the Monotonicity Constraint in Cooperative Multi-Agent Reinforcement Learning | | 0 |
| | | 0 |
| Variational oracle guiding for reinforcement learning | | 0 |
| Plan Your Target and Learn Your Skills: State-Only Imitation Learning via Decoupled Policy Optimization | | 0 |
| WaveCorr: Deep Reinforcement Learning with Permutation Invariant Policy Networks for Portfolio Management | | 0 |
| Polyphonic Music Composition: An Adversarial Inverse Reinforcement Learning Approach | | 0 |
| Particle Based Stochastic Policy Optimization | | 0 |
| Value Refinement Network (VRN) | | 0 |
| P4O: Efficient Deep Reinforcement Learning with Predictive Processing Proximal Policy Optimization | | 0 |
| Understanding the Generalization Gap in Visual Reinforcement Learning | | 0 |
| Uncertainty Regularized Policy Learning for Offline Reinforcement Learning | | 0 |
| Triangular Dropout: Variable Network Width without Retraining | | 0 |
| Meta Attention For Off-Policy Actor-Critic | | 0 |
| Metrics Matter: A Closer Look on Self-Paced Reinforcement Learning | | 0 |
| On Reward Maximization and Distribution Matching for Fine-Tuning Language Models | | 0 |
| Online Tuning for Offline Decentralized Multi-Agent Reinforcement Learning | | 0 |
| That Escalated Quickly: Compounding Complexity by Editing Levels at the Frontier of Agent Capabilities | | 0 |
| Model-based Reinforcement Learning with Ensembled Model-value Expansion | | 0 |
| Stability and Generalisation in Batch Reinforcement Learning | | 0 |
| Sequential Communication in Multi-Agent Reinforcement Learning | | 0 |
| Self-Supervised Structured Representations for Deep Reinforcement Learning | | 0 |
| Nested Policy Reinforcement Learning for Clinical Decision Support | | 0 |
| MURO: Deployment Constrained Reinforcement Learning with Model-based Uncertainty Regularized Batch Optimization | | 0 |
| Safe Exploration in Linear Equality Constraint | | 0 |
| Can Reinforcement Learning Efficiently Find Stackelberg-Nash Equilibria in General-Sum Markov Games? | | 0 |
| Continuous Deep Q-Learning in Optimal Control Problems: Normalized Advantage Functions Analysis | | 0 |
| An Attempt to Model Human Trust with Reinforcement Learning | | 0 |
| DSDF: Coordinated look-ahead strategy in stochastic multi-agent reinforcement learning | | 0 |
| Boosted Curriculum Reinforcement Learning | | 0 |
| Detecting Worst-case Corruptions via Loss Landscape Curvature in Deep Reinforcement Learning | | 0 |
| Graph-Enhanced Exploration for Goal-oriented Reinforcement Learning | | 0 |
| Learning to Solve Combinatorial Problems via Efficient Exploration | | 0 |
| Disentangling Sources of Risk for Distributional Multi-Agent Reinforcement Learning | | 0 |
| Deep Learning of Intrinsically Motivated Options in the Arcade Learning Environment | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |