SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes the expected long-term reward.
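The interaction loop described above can be illustrated with a minimal sketch. It uses tabular Q-learning, which is only one of many RL methods and is chosen here for brevity. The `ChainEnv` corridor environment, the function names, and all hyperparameter values are hypothetical, introduced for illustration; none of them come from the papers listed on this page.

```python
import random


class ChainEnv:
    """Hypothetical 5-state corridor: start in state 0, reward 1.0 on reaching the last state."""

    def __init__(self, n_states=5):
        self.n_states = n_states
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        # action 0 moves left, action 1 moves right; walls clamp the position.
        move = 1 if action == 1 else -1
        self.state = min(max(self.state + move, 0), self.n_states - 1)
        done = self.state == self.n_states - 1
        reward = 1.0 if done else 0.0
        return self.state, reward, done


def train_q_learning(env, episodes=500, alpha=0.1, gamma=0.9,
                     epsilon=0.1, max_steps=100, seed=0):
    """Learn action values from interaction alone (assumed hyperparameters)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(env.n_states)]  # q[state][action]
    for _ in range(episodes):
        state = env.reset()
        for _ in range(max_steps):
            # Epsilon-greedy: explore at random sometimes, and whenever values tie.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = env.step(action)
            # Bootstrapped target: immediate reward plus discounted future value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Pick the higher-valued action in each state (ties go to 'right')."""
    return [0 if left > right else 1 for left, right in q]


q_table = train_q_learning(ChainEnv())
policy = greedy_policy(q_table)
print(policy)
```

The reward is only given at the end of the corridor, so the agent has to propagate that signal backwards through the discounted target; this is what "maximizing long-term reward" means in practice for this toy case.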

Papers

Showing 1301–1325 of 15113 papers

Title | Status | Hype
Backprop-Free Reinforcement Learning with Active Neural Generative Coding | Code | 1
Evening the Score: Targeting SARS-CoV-2 Protease Inhibition in Graph Generative Models for Therapeutic Candidates | Code | 1
Explaining Autonomous Driving Actions with Visual Question Answering | Code | 1
Fast Population-Based Reinforcement Learning on a Single Machine | Code | 1
How Far I'll Go: Offline Goal-Conditioned Reinforcement Learning via f-Advantage Regression | Code | 1
Avalon: A Benchmark for RL Generalization Using Procedurally Generated Worlds | Code | 1
Conservative Offline Distributional Reinforcement Learning | Code | 1
Avalanche RL: a Continual Reinforcement Learning Library | Code | 1
An Experimental Design Perspective on Model-Based Reinforcement Learning | Code | 1
A Crash Course on Reinforcement Learning | Code | 1
Active Inference for Stochastic Control | Code | 1
ERL-Re^2: Efficient Evolutionary Reinforcement Learning with Shared State Representation and Individual Policy Representation | Code | 1
Constrained Policy Optimization via Bayesian World Models | Code | 1
Constraint-Guided Reinforcement Learning: Augmenting the Agent-Environment-Interaction | Code | 1
Are Expressive Models Truly Necessary for Offline RL? | Code | 1
Constrained Variational Policy Optimization for Safe Reinforcement Learning | Code | 1
Constrained Update Projection Approach to Safe Policy Optimization | Code | 1
Hybrid Multi-agent Deep Reinforcement Learning for Autonomous Mobility on Demand Systems | Code | 1
EpidemiOptim: A Toolbox for the Optimization of Control Policies in Epidemiological Models | Code | 1
Contextualized Rewriting for Text Summarization | Code | 1
Constructions in combinatorics via neural networks | Code | 1
Contention Window Optimization in IEEE 802.11ax Networks with Deep Reinforcement Learning | Code | 1
A Reinforcement Learning Approach for Rebalancing Electric Vehicle Sharing Systems | Code | 1
ICU-Sepsis: A Benchmark MDP Built from Real Medical Data | Code | 1
A Workflow for Offline Model-Free Robotic Reinforcement Learning | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | | Unverified