SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback, receiving rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes the expected long-term reward.
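The interaction loop described above can be illustrated with a minimal sketch of tabular Q-learning, one common RL algorithm. Everything here is an assumption made for illustration and does not come from any of the listed papers: the five-state chain environment, the `step`, `train`, and `greedy_policy` helper names, and the hyperparameter values are all hypothetical.

```python
import random

N_STATES = 5      # hypothetical toy chain: states 0..4, state 4 is terminal
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Toy environment: reward 1.0 only when the last state is reached."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)  # explore
            else:
                # Exploit, breaking ties between equal values at random.
                best = max(q[state])
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # Bootstrapped target: immediate reward plus discounted best next value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Best action in each non-terminal state according to the learned values."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]
```

Calling `greedy_policy(train())` returns one action per non-terminal state. In this toy chain the only reward lies at the right end, so the learned policy should prefer "move right" everywhere. The discount factor `gamma` is what makes the agent value long-term reward rather than only the immediate one.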

Papers

Showing 3626–3650 of 15113 papers

Title | Status | Hype
Alpha-Mini: Minichess Agent with Deep Reinforcement Learning | Code | 0
Deep reinforcement learning for feedback control in a collective flashing ratchet | Code | 0
Active Collection of Well-Being and Health Data in Mobile Devices | Code | 0
Generative Question Refinement with Deep Reinforcement Learning in Retrieval-based QA System | Code | 0
Q-Mixing Network for Multi-Agent Pathfinding in Partially Observable Grid Environments | Code | 0
Collaborative Deep Reinforcement Learning | Code | 0
Deep Reinforcement Learning for General Video Game AI | Code | 0
A2PO: Towards Effective Offline Reinforcement Learning from an Advantage-aware Perspective | Code | 0
Generative Adversarial Network for Abstractive Text Summarization | Code | 0
Generic Itemset Mining Based on Reinforcement Learning | Code | 0
Cold-Start Reinforcement Learning with Softmax Policy Gradient | Code | 0
Quantum Deep Reinforcement Learning for Robot Navigation Tasks | Code | 0
Generating Classical Chinese Poems from Vernacular Chinese | Code | 0
General policy mapping: online continual reinforcement learning inspired on the insect brain | Code | 0
Generalized Population-Based Training for Hyperparameter Optimization in Reinforcement Learning | Code | 0
Deep Reinforcement Learning for Industrial Insertion Tasks with Visual Inputs and Natural Rewards | Code | 0
Queueing Network Controls via Deep Reinforcement Learning | Code | 0
Generalized Speedy Q-learning | Code | 0
General Policy Evaluation and Improvement by Learning to Identify Few But Crucial States | Code | 0
Generalized Adaptive Transfer Network: Enhancing Transfer Learning in Reinforcement Learning Across Domains | Code | 0
Approximate Model-Based Shielding for Safe Reinforcement Learning | Code | 0
Generalization Tower Network: A Novel Deep Neural Network Architecture for Multi-Task Learning | Code | 0
Approximately Optimal Search on a Higher-dimensional Sliding Puzzle | Code | 0
Generalization through Simulation: Integrating Simulated and Real Data into Deep Reinforcement Learning for Vision-Based Autonomous Flight | Code | 0
CODEX: A Cluster-Based Method for Explainable Reinforcement Learning | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | - | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | - | Unverified