SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes the expected long-term reward.
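The interaction loop described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from any paper listed on this page: a hypothetical five-state corridor environment (`step`), a discount factor of 0.9, and tabular Q-learning as one possible way of learning a policy from rewards. The learning rate of 1.0 is only reasonable here because the toy environment is deterministic.

```python
import random

N_STATES = 5          # states 0..4; state 4 is the terminal goal (hypothetical toy task)
ACTIONS = (-1, +1)    # index 0 = step left, index 1 = step right
GAMMA = 0.9           # discount factor (assumed value)
ALPHA = 1.0           # learning rate; 1.0 works only because the environment is deterministic


def step(state, action_index):
    """Toy environment: move along the corridor; reward 1.0 on reaching the goal."""
    next_state = min(max(state + ACTIONS[action_index], 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def train(episodes=50, seed=0):
    """Tabular Q-learning with a uniformly random behavior policy (off-policy)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            action = rng.randrange(2)                      # explore at random
            next_state, reward, done = step(state, action)
            # Bootstrapped target: immediate reward plus discounted best future value.
            target = reward if done else reward + GAMMA * max(q[next_state])
            q[state][action] += ALPHA * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Pick the higher-valued action in every non-terminal state."""
    return ["left" if left > right else "right" for left, right in q[:-1]]


q_table = train()
print(greedy_policy(q_table))  # → ['right', 'right', 'right', 'right']
```

In this sketch the agent only ever receives a reward at the goal, yet the learned values propagate backwards through the discounted target, so the greedy policy heads towards the goal from every state. Realistic tasks generally need a smaller learning rate and a less naive exploration scheme.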

Papers

Showing 7826–7850 of 15113 papers

| Title | Status | Hype |
|---|---|---|
| THE SJTU SYSTEM FOR DCASE2021 CHALLENGE TASK 6: AUDIO CAPTIONING BASED ON ENCODER PRE-TRAINING AND REINFORCEMENT LEARNING | Code | 1 |
| Meta-Reinforcement Learning for Heuristic Planning | | 0 |
| Multi-Modal Mutual Information (MuMMI) Training for Robust Self-Supervised Deep Reinforcement Learning | Code | 1 |
| A Unified Off-Policy Evaluation Approach for General Value Function | | 0 |
| A Short Note on the Relationship of Information Gain and Eluder Dimension | | 0 |
| AdaRL: What, Where, and How to Adapt in Transfer Reinforcement Learning | Code | 1 |
| Gradient Importance Learning for Incomplete Observations | Code | 0 |
| Control of rough terrain vehicles using deep reinforcement learning | | 0 |
| Ensemble and Auxiliary Tasks for Data-Efficient Deep Reinforcement Learning | Code | 0 |
| A Review of Explainable Artificial Intelligence in Manufacturing | | 0 |
| Agents that Listen: High-Throughput Reinforcement Learning with Multiple Sensory Systems | Code | 1 |
| The Least Restriction for Offline Reinforcement Learning | | 0 |
| Winning at Any Cost -- Infringing the Cartel Prohibition With Reinforcement Learning | | 0 |
| Sample Efficient Reinforcement Learning via Model-Ensemble Exploration and Exploitation | Code | 1 |
| Low Dimensional State Representation Learning with Robotics Priors in Continuous Action Spaces | | 0 |
| Low-Dimensional State and Action Representation Learning with MDP Homomorphism Metrics | | 0 |
| Restless and Uncertain: Robust Policies for Restless Bandits via Deep Multi-Agent Reinforcement Learning | | 0 |
| Traffic Signal Control with Communicative Deep Reinforcement Learning Agents: a Case Study | | 0 |
| Optimality Inductive Biases and Agnostic Guidelines for Offline Reinforcement Learning | Code | 0 |
| Mava: a research library for distributed multi-agent reinforcement learning in JAX | Code | 1 |
| Examining average and discounted reward optimality criteria in reinforcement learning | | 0 |
| Beyond Value-Function Gaps: Improved Instance-Dependent Regret Bounds for Episodic Reinforcement Learning | | 0 |
| Controlled Interacting Particle Algorithms for Simulation-based Reinforcement Learning | Code | 0 |
| RL-NCS: Reinforcement learning based data-driven approach for nonuniform compressed sensing | Code | 0 |
| Reinforcement Learning for Feedback-Enabled Cyber Resilience | | 0 |
Page 314 of 605

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |