SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes the expected long-term reward.
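
The interaction loop described above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from this page: the five-cell corridor environment, the `step`, `train`, and `greedy_policy` helpers, and the choice of tabular Q-learning with epsilon-greedy exploration as one possible learning rule among many.

```python
import random

# Hypothetical toy environment: a corridor of N_STATES cells.
# The agent starts in cell 0 and is rewarded on reaching the last cell.
N_STATES = 5
ACTIONS = (-1, +1)  # move left, move right


def step(state, action):
    """Return (next_state, reward, done) for one environment transition."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration (illustrative)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action_index]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore at random with probability epsilon or on ties,
            # otherwise exploit the current value estimates.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, ACTIONS[a])
            # Move the estimate toward reward plus discounted future value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """Best action for each non-terminal state under the learned values."""
    return [ACTIONS[0] if row[0] > row[1] else ACTIONS[1] for row in q[:-1]]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

The discount factor `gamma` is what makes the learned values reflect long-term rather than immediate reward: cells farther from the goal receive smaller, but still positive, values for moving right.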

Papers

Showing 9726–9750 of 15113 papers

| Title | Status | Hype |
|---|---|---|
| MS-Ranker: Accumulating Evidence from Potentially Correct Candidates for Answer Selection | | 0 |
| Deep Reinforcement Learning for Asset Allocation in US Equities | | 0 |
| Graph Convolutional Value Decomposition in Multi-Agent Reinforcement Learning | Code | 1 |
| Characterizing Policy Divergence for Personalized Meta-Reinforcement Learning | | 0 |
| Parameterized Reinforcement Learning for Optical System Optimization | | 0 |
| Instance Weighted Incremental Evolution Strategies for Reinforcement Learning in Dynamic Environments | Code | 0 |
| Deep RL With Information Constrained Policies: Generalization in Continuous Control | | 0 |
| Learning to Locomote: Understanding How Environment Design Matters for Deep Reinforcement Learning | | 0 |
| LaND: Learning to Navigate from Disengagements | Code | 1 |
| Jointly-Learned State-Action Embedding for Efficient Reinforcement Learning | | 0 |
| Dynamic Context Selection for Document-level Neural Machine Translation via Reinforcement Learning | | 0 |
| EpidemiOptim: A Toolbox for the Optimization of Control Policies in Epidemiological Models | Code | 1 |
| Learning Intrinsic Symbolic Rewards in Reinforcement Learning | | 0 |
| CausalWorld: A Robotic Manipulation Benchmark for Causal Structure and Transfer Learning | Code | 1 |
| Maximum Reward Formulation In Reinforcement Learning | Code | 0 |
| Trajectory Inspection: A Method for Iterative Clinician-Driven Design of Reinforcement Learning Studies | Code | 1 |
| Text-based RL Agents with Commonsense Knowledge: New Challenges, Environments and Baselines | Code | 1 |
| Nonstationary Reinforcement Learning with Linear Function Approximation | | 0 |
| Provable Fictitious Play for General Mean-Field Games | | 0 |
| Information-Driven Adaptive Sensing Based on Deep Reinforcement Learning | Code | 0 |
| Actor-Critic Algorithm for High-dimensional Partial Differential Equations | | 0 |
| Reinforcement Learning for Many-Body Ground-State Preparation Inspired by Counterdiabatic Driving | | 0 |
| Regularized Inverse Reinforcement Learning | | 0 |
| Online Safety Assurance for Deep Reinforcement Learning | | 0 |
| Variational Intrinsic Control Revisited | | 0 |
Page 390 of 605

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |