SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy mapping states to actions, that maximizes the expected long-term reward.
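As a rough illustration of the description above, the following sketch runs tabular Q-learning on a tiny five-state corridor. Everything in it is an assumption made for illustration and does not come from any paper listed on this page: the toy environment, the names `step`, `discounted_return`, `train_q_learning`, and `greedy_policy`, and the hyperparameters. The behaviour policy is uniformly random, which works here because Q-learning is off-policy.

```python
import random

N_STATES = 5          # states 0..4 on a line; state 4 is the terminal goal
ACTIONS = (0, 1)      # 0 = move left, 1 = move right


def step(state, action):
    """Hypothetical toy environment: returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0      # reward only when the goal is reached
    return next_state, reward, done


def discounted_return(rewards, gamma):
    """Cumulative reward of one episode: sum of gamma**t * r_t."""
    return sum((gamma ** t) * r for t, r in enumerate(rewards))


def train_q_learning(episodes=500, alpha=0.5, gamma=0.9, max_steps=100, seed=0):
    """Learn action values Q[state][action] from interaction with `step`."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            action = rng.choice(ACTIONS)               # random exploration
            next_state, reward, done = step(state, action)
            # Bootstrapped target: immediate reward plus discounted best
            # estimated value of the next state (zero after termination).
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Policy that picks the highest-valued action in each non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


q_table = train_q_learning()
print(greedy_policy(q_table))
```

The learned table plays the role of the policy: acting greedily with respect to it selects, in each state, the action with the highest estimated long-term (discounted) reward.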

Papers

Showing 5951-6000 of 15113 papers

Title | Status | Hype
Learning MDPs from Features: Predict-Then-Optimize for Sequential Decision Making by Reinforcement Learning | | 0
Learning in Mean Field Games: A Survey | | 0
Learning medical triage from clinicians using Deep Q-Learning | | 0
Learning Memory-Dependent Continuous Control from Demonstrations | | 0
Data-Driven Merton's Strategies via Policy Randomization | | 0
Learning Meta Representations for Agents in Multi-Agent Reinforcement Learning | | 0
Learning Mobile Robot Navigation in the Dense Crowd with Deep Reinforcement Learning | | 0
Learning Modular Neural Network Policies for Multi-Task and Multi-Robot Transfer | | 0
Decision Making in Monopoly using a Hybrid Deep Reinforcement Learning Approach | | 0
Learning Montezuma's Revenge from a Single Demonstration | | 0
Learning Multi-Agent Intention-Aware Communication for Optimal Multi-Order Execution in Finance | | 0
Learning Multi-Task Transferable Rewards via Variational Inverse Reinforcement Learning | | 0
Learning Natural Language Generation from Scratch | | 0
Learning Navigation Behaviors End-to-End with AutoRL | | 0
Learning Near Optimal Policies with Low Inherent Bellman Error | | 0
Learning Not to Spoof | | 0
Learning objects from pixels | | 0
Learning offline: memory replay in biological and artificial reinforcement learning | | 0
Learning Off-policy with Model-based Intrinsic Motivation For Active Online Exploration | | 0
Learning on Abstract Domains: A New Approach for Verifiable Guarantee in Reinforcement Learning | | 0
Learning Online Policies for Person Tracking in Multi-View Environments | | 0
Learning on the Job: Long-Term Behavioural Adaptation in Human-Robot Interactions | | 0
Learning Open Domain Multi-hop Search Using Reinforcement Learning | | 0
Learning Optimal Deterministic Policies with Stochastic Policy Gradients | | 0
Learning Optimal Strategies for Temporal Tasks in Stochastic Games | | 0
Learning Optimal Treatment Strategies for Sepsis Using Offline Reinforcement Learning in Continuous Space | | 0
Learning optimal treatment strategies for intraoperative hypotension using deep reinforcement learning | | 0
Learning Options from Demonstration using Skill Segmentation | | 0
Learning over All Stabilizing Nonlinear Controllers for a Partially-Observed Linear System | | 0
Learning Parsimonious Dynamics for Generalization in Reinforcement Learning | | 0
Learning Partially Observable Deterministic Action Models | | 0
Learning Perception-Aware Agile Flight in Cluttered Environments | | 0
Learning Personalized Discretionary Lane-Change Initiation for Fully Autonomous Driving Based on Reinforcement Learning | | 0
Learning Personalized Human-Aware Robot Navigation Using Virtual Reality Demonstrations from a User Study | | 0
Learning Pessimism for Robust and Efficient Off-Policy Reinforcement Learning | | 0
Learning Physics Priors for Deep Reinforcement Learing | | 0
Learning Plasma Dynamics and Robust Rampdown Trajectories with Predict-First Experiments at TCV | | 0
Learning Policy Representations in Multiagent Systems | | 0
Learning Polynomial Representations of Physical Objects with Application to Certifying Correct Packing Configurations | | 0
Learning Power Control from a Fixed Batch of Data | | 0
Learning Practical Communication Strategies in Cooperative Multi-Agent Reinforcement Learning | | 0
Learning Predictive Communication by Imagination in Networked System Control | | 0
Learning predictive representations in autonomous driving to improve deep reinforcement learning | | 0
Learning Predictive Safety Filter via Decomposition of Robust Invariant Set | | 0
Inferring Probabilistic Reward Machines from Non-Markovian Reward Processes for Reinforcement Learning | | 0
Learning proposals for sequential importance samplers using reinforced variational inference | | 0
Learning Proxemic Behavior Using Reinforcement Learning with Cognitive Agents | | 0
Learning Pseudometric-based Action Representations for Offline Reinforcement Learning | | 0
Learning Quadruped Locomotion Policies using Logical Rules | | 0
Learning Realistic Traffic Agents in Closed-loop | | 0
Page 120 of 303

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | | Unverified