SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback, receiving rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes long-term reward.
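The interaction loop described above can be sketched in a few lines of Python. This is a minimal illustration, not taken from any listed paper: `ChainEnv`, `run_episode`, `discounted_return`, and the discount factor `gamma = 0.9` are all hypothetical names and values chosen for the example. It shows an agent following a fixed policy, collecting rewards from the environment, and scoring the episode by its discounted cumulative reward.

```python
from typing import Callable, List, Tuple


def discounted_return(rewards: List[float], gamma: float = 0.9) -> float:
    """Cumulative reward, with each later reward discounted by gamma per step."""
    total = 0.0
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


class ChainEnv:
    """Hypothetical toy environment: states 0..n-1 on a line.

    Reaching the last state ends the episode and pays a reward of +1.
    """

    def __init__(self, n_states: int = 4) -> None:
        self.n_states = n_states
        self.state = 0

    def reset(self) -> int:
        self.state = 0
        return self.state

    def step(self, action: int) -> Tuple[int, float, bool]:
        # Action 1 moves right; any other action moves left (clamped at state 0).
        move = 1 if action == 1 else -1
        self.state = max(0, min(self.n_states - 1, self.state + move))
        done = self.state == self.n_states - 1
        reward = 1.0 if done else 0.0
        return self.state, reward, done


def run_episode(
    env: ChainEnv, policy: Callable[[int], int], max_steps: int = 20
) -> List[float]:
    """Run one episode under a fixed policy and return the rewards received."""
    state = env.reset()
    rewards: List[float] = []
    for _ in range(max_steps):
        state, reward, done = env.step(policy(state))
        rewards.append(reward)
        if done:
            break
    return rewards


# Two fixed policies: one always moves right, one always moves left.
right_rewards = run_episode(ChainEnv(4), lambda s: 1)
left_rewards = run_episode(ChainEnv(4), lambda s: 0)
```

Under these assumptions the always-right policy reaches the goal in three steps, while the always-left policy never does, so comparing `discounted_return(right_rewards)` with `discounted_return(left_rewards)` shows why the first policy is preferred. A learning algorithm would search for such a policy rather than have it supplied by hand.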

Papers

Showing 13451-13500 of 15113 papers

Title | Status | Hype
Trial without Error: Towards Safe Reinforcement Learning via Human Intervention | Code | 0
Pac-Man Pete: An extensible framework for building AI in VEX Robotics | Code | 0
Viability of Future Actions: Robust Safety in Reinforcement Learning via Entropy Regularization | Code | 0
PAC-Bayesian Soft Actor-Critic Learning | Code | 0
Macro action selection with deep reinforcement learning in StarCraft | Code | 0
PAC: Assisted Value Factorisation with Counterfactual Predictions in Multi-Agent Reinforcement Learning | Code | 0
To Measure or Not: A Cost-Sensitive, Selective Measuring Environment for Agricultural Management Decisions with Reinforcement Learning | Code | 0
SAGE: Generating Symbolic Goals for Myopic Models in Deep Reinforcement Learning | Code | 0
Real-World Dexterous Object Manipulation based Deep Reinforcement Learning | Code | 0
Straight to the Point: Fast-forwarding Videos via Reinforcement Learning Using Textual Data | Code | 0
Neural SLAM: Learning to Explore with External Memory | Code | 0
Strangeness-driven Exploration in Multi-Agent Reinforcement Learning | Code | 0
P3O: Policy-on Policy-off Policy Optimization | Code | 0
Real-time visual tracking by deep reinforced decision making | Code | 0
Real-Time Reinforcement Learning | Code | 0
Overcoming Slow Decision Frequencies in Continuous Control: Model-Based Sequence Reinforcement Learning for Model-Free Control | Code | 0
Overcoming Overfitting in Reinforcement Learning via Gaussian Process Diffusion Policy | Code | 0
Neural Sequence Model Training via α-divergence Minimization | Code | 0
Multiagent Inverse Reinforcement Learning via Theory of Mind Reasoning | Code | 0
ToolRL: Reward is All Tool Learning Needs | Code | 0
Strategic Dialogue Management via Deep Reinforcement Learning | Code | 0
Tools for Data-driven Modeling of Within-Hand Manipulation with Underactuated Adaptive Hands | Code | 0
Neural Reward Machines | Code | 0
Multi-Agent Image Classification via Reinforcement Learning | Code | 0
Neural Optimizer Search with Reinforcement Learning | Code | 0
TorchBeast: A PyTorch Platform for Distributed RL | Code | 0
TorchProteinLibrary: A computationally efficient, differentiable representation of protein structure | Code | 0
Sample Complexity of Robust Reinforcement Learning with a Generative Model | Code | 0
Neural Operator based Reinforcement Learning for Control of first-order PDEs with Spatially-Varying State Delay | Code | 0
TripleTree: A Versatile Interpretable Representation of Black Box Agents and their Environments | Code | 0
Overcoming Exploration in Reinforcement Learning with Demonstrations | Code | 0
Real-time calibration of coherent-state receivers: learning by trial and error | Code | 0
Neural Network Dynamics for Model-Based Deep Reinforcement Learning with Model-Free Fine-Tuning | Code | 0
MINOS: Multimodal Indoor Simulator for Navigation in Complex Environments | Code | 0
Real-Time Bidding by Reinforcement Learning in Display Advertising | Code | 0
Real-time Adversarial Perturbations against Deep Reinforcement Learning Policies: Attacks and Defenses | Code | 0
TrojDRL: Trojan Attacks on Deep Reinforcement Learning Agents | Code | 0
To the Max: Reinventing Reward in Reinforcement Learning | Code | 0
Multi-Agent Deep Reinforcement Learning for Large-scale Traffic Signal Control | Code | 0
Sample-Efficient Deep Reinforcement Learning via Episodic Backward Update | Code | 0
ORSO: Accelerating Reward Design via Online Reward Selection and Policy Optimization | Code | 0
Model-free Reinforcement Learning in Infinite-horizon Average-reward Markov Decision Processes | Code | 0
Solving the Real Robot Challenge using Deep Reinforcement Learning | Code | 0
Read, Watch, and Move: Reinforcement Learning for Temporally Grounding Natural Language Descriptions in Videos | Code | 0
Orchestrated Value Mapping for Reinforcement Learning | Code | 0
Neural Modular Control for Embodied Question Answering | Code | 0
Model-Based Reinforcement Learning with Multi-Task Offline Pretraining | Code | 0
USHER: Unbiased Sampling for Hindsight Experience Replay | Code | 0
Rationally Inattentive Inverse Reinforcement Learning Explains YouTube Commenting Behavior | Code | 0
Sample-Efficient Model-Free Reinforcement Learning with Off-Policy Critics | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | - | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | - | Unverified