SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment to maximize a cumulative reward signal. The agent interacts with the environment and learns by receiving feedback in the form of rewards or punishments for its actions. The goal of reinforcement learning is to find the optimal policy or decision-making strategy that maximizes the long-term reward.
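The loop described above (act, observe a reward, update the decision rule) can be sketched in a few lines. This is a minimal illustration and is not taken from any paper listed on this page. The five-state chain environment, the `step`, `discounted_return`, and `train` helpers, and all hyperparameter values are assumptions made for the example. Tabular Q-learning is used here as one common way to learn a policy, not as the only or preferred method.

```python
import random

N_STATES = 5          # states 0..4; state 4 is the rewarding terminal state (assumed toy setup)
ACTIONS = (0, 1)      # 0 = move left, 1 = move right


def step(state, action):
    """Hypothetical chain environment: returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def discounted_return(rewards, gamma):
    """Cumulative reward of one episode, discounted by gamma per time step."""
    return sum(gamma ** t * r for t, r in enumerate(rewards))


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration (illustrative values)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _episode in range(episodes):
        state = 0
        for _t in range(100):  # cap on episode length
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)          # explore
            else:
                best = max(q[state])                   # exploit, ties broken at random
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # Feedback from the environment: move Q toward reward + discounted future value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


q_table = train()
# Greedy policy for the non-terminal states: the action with the highest learned value.
policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
print(policy)
```

In this toy setup the only reward sits at the right end of the chain, so the learned greedy policy should settle on moving right in every state. `discounted_return` shows what "long-term reward" means numerically: a reward of 1 received two steps in the future is worth `0.9 ** 2` today when `gamma` is 0.9.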

Papers

Showing 5901-5925 of 15113 papers

Title | Status | Hype
Offline Reinforcement Learning from Human Feedback in Real-World Sequence-to-Sequence Tasks | | 0
Learning from Longitudinal Face Demonstration - Where Tractable Deep Modeling Meets Inverse Reinforcement Learning | | 0
Learning from Outside the Viability Kernel: Why we Should Build Robots that can Fall with Grace | | 0
Learning from Peers: Deep Transfer Reinforcement Learning for Joint Radio and Cache Resource Allocation in 5G RAN Slicing | | 0
Learning from Reward-Free Offline Data: A Case for Planning with Latent Dynamics Models | | 0
Learning from Simulation, Racing in Reality | | 0
Learning from Suboptimal Data in Continuous Control via Auto-Regressive Soft Q-Network | | 0
Learning from Symmetry: Meta-Reinforcement Learning with Symmetrical Behaviors and Language Instructions | | 0
Learning from Teaching Regularization: Generalizable Correlations Should be Easy to Imitate | | 0
Learning Functionally Decomposed Hierarchies for Continuous Navigation Tasks | | 0
Learning Future Representation with Synthetic Observations for Sample-efficient Reinforcement Learning | | 0
Learning Gaussian Policies from Smoothed Action Value Functions | | 0
Learning Generalizable Agents via Saliency-Guided Features Decorrelation | | 0
Learning Generalized Wireless MAC Communication Protocols via Abstraction | | 0
Learning General-Purpose Controllers via Locally Communicating Sensorimotor Modules | | 0
Learning General World Models in a Handful of Reward-Free Deployments | | 0
Learning Goal-Conditioned Value Functions with one-step Path rewards rather than Goal-Rewards | | 0
Learning Goal-Oriented Visual Dialog Agents: Imitating and Surpassing Analytic Experts | | 0
Learning Good Policies By Learning Good Perceptual Models | | 0
Learning Good Representation via Continuous Attention | | 0
LearningGroup: A Real-Time Sparse Training on FPGA via Learnable Weight Grouping for Multi-Agent Reinforcement Learning | | 0
Learning Heuristics for Automated Reasoning through Reinforcement Learning | | 0
Learning Heuristics for Quantified Boolean Formulas through Reinforcement Learning | | 0
Learning Hierarchical Structures On-The-Fly with a Recurrent-Recursive Model for Sequences | | 0
Learning how to Interact with a Complex Interface using Hierarchical Reinforcement Learning | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | | Unverified