SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes the expected long-term reward.
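
The agent-environment loop described above can be illustrated with a minimal sketch. It uses tabular Q-learning, which is only one of many RL algorithms and is chosen here purely for brevity. The corridor environment, the names step, train and greedy_policy, and all hyperparameter values are assumptions made for this illustration and do not come from any paper listed on this page.

```python
import random

N_STATES = 5          # hypothetical corridor with cells 0..4; cell 4 is the goal
ACTIONS = (-1, +1)    # move left or move right


def step(state, action):
    """Toy environment (assumed): returns (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0   # reward only when the goal is reached
    return next_state, reward, done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    # q[state][action_index] estimates the discounted cumulative reward
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon, or when both actions look equal
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, ACTIONS[a])
            # Bootstrapped target: immediate reward plus discounted best next value
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """Best action per non-terminal cell (ties resolve to moving right)."""
    return [ACTIONS[0 if row[0] > row[1] else 1] for row in q[:-1]]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

In this toy setting the learned policy should prefer moving right in every cell, since that is the shortest path to the only reward. Real environments typically need function approximation and far more careful exploration than this sketch uses.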

Papers

Showing 8251–8300 of 15113 papers

Title | Status | Hype
Greedy-based Value Representation for Efficient Coordination in Multi-agent Reinforcement Learning | | 0
Deep Inverse Reinforcement Learning via Adversarial One-Class Classification | | 0
Interpreting Reinforcement Policies through Local Behaviors | | 0
Adaptive Graph Capsule Convolutional Networks | | 0
Bayesian Exploration for Lifelong Reinforcement Learning | | 0
Deep Learning of Intrinsically Motivated Options in the Arcade Learning Environment | | 0
Combinatorial Reinforcement Learning Based Scheduling for DNN Execution on Edge | | 0
Joint Self-Supervised Learning for Vision-based Reinforcement Learning | | 0
A Principled Permutation Invariant Approach to Mean-Field Multi-Agent Reinforcement Learning | | 0
Faster Reinforcement Learning with Value Target Lower Bounding | | 0
DSDF: Coordinated look-ahead strategy in stochastic multi-agent reinforcement learning | | 0
Adaptive Q-learning for Interaction-Limited Reinforcement Learning | | 0
DiBB: Distributing Black-Box Optimization | | 0
Closed-Loop Control of Additive Manufacturing via Reinforcement Learning | | 0
Exploring the Robustness of Distributional Reinforcement Learning against Noisy State Observations | | 0
Boosted Curriculum Reinforcement Learning | | 0
Reachability Traces for Curriculum Design in Reinforcement Learning | | 0
Text Generation with Efficient (Soft) Q-Learning | | 0
That Escalated Quickly: Compounding Complexity by Editing Levels at the Frontier of Agent Capabilities | | 0
OVD-Explorer: A General Information-theoretic Exploration Approach for Reinforcement Learning | | 0
 | | 0
Sequential Communication in Multi-Agent Reinforcement Learning | | 0
Semi-supervised Offline Reinforcement Learning with Pre-trained Decision Transformers | | 0
The Essential Elements of Offline RL via Supervised Learning | | 0
The guide and the explorer: smart agents for resource-limited iterated batch reinforcement learning | | 0
Resmax: An Alternative Soft-Greedy Operator for Reinforcement Learning | | 0
Plan Your Target and Learn Your Skills: State-Only Imitation Learning via Decoupled Policy Optimization | | 0
Self-Supervised Structured Representations for Deep Reinforcement Learning | | 0
Multi-Agent Reinforcement Learning with Shared Resource in Inventory Management | | 0
Theoretical understanding of adversarial reinforcement learning via mean-field optimal control | | 0
Multi-batch Reinforcement Learning via Sample Transfer and Imitation Learning | | 0
The Remarkable Effectiveness of Combining Policy and Value Networks in A*-based Deep RL for AI Planning | | 0
Offline-Online Reinforcement Learning: Extending Batch and Online RL | | 0
P4O: Efficient Deep Reinforcement Learning with Predictive Processing Proximal Policy Optimization | | 0
Rethinking Pareto Approaches in Constrained Reinforcement Learning | | 0
Offline Pre-trained Multi-Agent Decision Transformer | | 0
Should I Run Offline Reinforcement Learning or Behavioral Cloning? | | 0
Selective Token Generation for Few-shot Language Modeling | | 0
Offline Reinforcement Learning for Large Scale Language Action Spaces | | 0
Task-driven Discovery of Perceptual Schemas for Generalization in Reinforcement Learning | | 0
Targeted Environment Design from Offline Data | | 0
Revisiting the Monotonicity Constraint in Cooperative Multi-Agent Reinforcement Learning | | 0
Offline Reinforcement Learning with Resource Constrained Online Deployment | | 0
Towards Understanding Distributional Reinforcement Learning: Regularization, Optimization, Acceleration and Sinkhorn Algorithm | | 0
Towards Unknown-aware Deep Q-Learning | | 0
Model-based Reinforcement Learning with Ensembled Model-value Expansion | | 0
Rewardless Open-Ended Learning (ROEL) | | 0
Transformers are Meta-Reinforcement Learners | | 0
Triangular Dropout: Variable Network Width without Retraining | | 0
MOBA: Multi-teacher Model Based Reinforcement Learning | | 0
Page 166 of 303

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | | Unverified