SOTAVerified

Reinforcement Learning (RL)

Reinforcement learning (RL) trains an agent to act in an environment so as to maximize a cumulative reward signal. The agent learns by interacting with the environment and receiving feedback on its actions in the form of rewards or penalties. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes long-term reward.
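To make the agent, environment, and reward loop concrete, here is a minimal sketch using tabular Q-learning, one common RL algorithm, on a hypothetical five-state chain. The environment, the function names (`step`, `choose_action`, `train`, `greedy_policy`, `discounted_return`), and the hyperparameter values are all illustrative assumptions. They are not taken from any paper or benchmark listed on this page.

```python
import random

N_STATES = 5      # hypothetical chain: states 0..4, state 4 is the goal
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Toy environment (an assumption): returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0  # reward only on reaching the goal
    return next_state, reward, done


def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy choice; ties between actions are broken at random."""
    if rng.random() < epsilon or q[state][0] == q[state][1]:
        return rng.choice(ACTIONS)
    return 0 if q[state][0] > q[state][1] else 1


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1,
          max_steps=1000, seed=0):
    """Tabular Q-learning: learn action values from interaction alone."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            action = choose_action(q, state, epsilon, rng)
            next_state, reward, done = step(state, action)
            # Bootstrapped target: immediate reward plus discounted
            # value of the best action in the next state.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Best-known action for each non-terminal state."""
    return [0 if q[s][0] > q[s][1] else 1 for s in range(N_STATES - 1)]


def discounted_return(rewards, gamma):
    """Cumulative reward the agent maximizes: sum of gamma**t * r_t."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g


if __name__ == "__main__":
    q_table = train()
    print("greedy policy (1 = right):", greedy_policy(q_table))
```

In this sketch the agent is never told that moving right is correct. It infers that from the reward alone, which is the sense in which the policy is learned through feedback. The discount factor `gamma` sets how strongly later rewards count toward the long-term return.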

Papers

Showing 3876–3900 of 15113 papers

| Title | Status | Hype |
| --- | --- | --- |
| DeepSynth: Automata Synthesis for Automatic Task Segmentation in Deep Reinforcement Learning | Code | 0 |
| Deep TAMER: Interactive Agent Shaping in High-Dimensional State Spaces | Code | 0 |
| BadRL: Sparse Targeted Backdoor Attack Against Reinforcement Learning | Code | 0 |
| DeepTPI: Test Point Insertion with Deep Reinforcement Learning | Code | 0 |
| DeepTraffic: Crowdsourced Hyperparameter Tuning of Deep Reinforcement Learning Systems for Multi-Agent Dense Traffic Navigation | Code | 0 |
| CFlowNets: Continuous Control with Generative Flow Networks | Code | 0 |
| Deep Transfer Reinforcement Learning for Text Summarization | Code | 0 |
| Answers Unite! Unsupervised Metrics for Reinforced Summarization Models | Code | 0 |
| Deep Variational Reinforcement Learning for POMDPs | Code | 0 |
| Flight Controller Synthesis Via Deep Reinforcement Learning | Code | 0 |
| Deep Variation-structured Reinforcement Learning for Visual Relationship and Attribute Detection | Code | 0 |
| Flexible Option Learning | Code | 0 |
| Free energy-based reinforcement learning using a quantum processor | Code | 0 |
| Certified Policy Smoothing for Cooperative Multi-Agent Reinforcement Learning | Code | 0 |
| Deep W-Networks: Solving Multi-Objective Optimisation Problems With Deep Reinforcement Learning | Code | 0 |
| Certification of Iterative Predictions in Bayesian Neural Networks | Code | 0 |
| Flappy Hummingbird: An Open Source Dynamic Simulation of Flapping Wing Robots and Animals | Code | 0 |
| Defending Observation Attacks in Deep Reinforcement Learning via Detection and Denoising | Code | 0 |
| Semantic RL with Action Grammars: Data-Efficient Learning of Hierarchical Task Abstractions | Code | 0 |
| Centralized Training with Hybrid Execution in Multi-Agent Reinforcement Learning | Code | 0 |
| Centralized Model and Exploration Policy for Multi-Agent RL | Code | 0 |
| Fleet Control using Coregionalized Gaussian Process Policy Iteration | Code | 0 |
| CEM-GD: Cross-Entropy Method with Gradient Descent Planner for Model-Based Reinforcement Learning | Code | 0 |
| Finite-Time Performance Bounds and Adaptive Learning Rate Selection for Two Time-Scale Reinforcement Learning | Code | 0 |
| Fine-tuning Reinforcement Learning Models is Secretly a Forgetting Mitigation Problem | Code | 0 |
Page 156 of 605

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |