SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes the expected long-term reward.
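The interaction loop described above can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and does not come from this page: the five-state corridor environment, the `step`, `train`, and `greedy_policy` helpers, the hyperparameter values, and the choice of tabular Q-learning as the learning rule. Other RL algorithms follow the same act, observe, update pattern.

```python
import random

# Hypothetical toy environment: a corridor of 5 states.
# The agent starts at state 0; reaching state 4 pays +1 and ends the episode.
N_STATES = 5
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Return (next_state, reward, done) for the toy corridor."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon, or when both estimates tie.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.choice(ACTIONS)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, action)
            # Bootstrapped target: immediate reward plus discounted future value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Best action per non-terminal state according to the learned values."""
    return [0 if left > right else 1 for left, right in q[:-1]]


if __name__ == "__main__":
    q_values = train()
    print(greedy_policy(q_values))
```

In this sketch the learned policy is the list of greedy actions per state, and the discount factor `gamma` is what makes the agent weigh long-term reward rather than only the immediate one.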

Papers

Showing 6201–6225 of 15113 papers

Title | Status | Hype
Learn Quasi-stationary Distributions of Finite State Markov Chain | | 0
Learn to Earn: Enabling Coordination within a Ride Hailing Fleet | | 0
Learn to Exceed: Stereo Inverse Reinforcement Learning with Concurrent Policy Optimization | | 0
Learn To Manage Portfolio With Reinforcement Learning | | 0
Learn to Match with No Regret: Reinforcement Learning in Markov Matching Markets | | 0
Learn to Play Tetris with Deep Reinforcement Learning | | 0
Learn to Teach: Sample-Efficient Privileged Learning for Humanoid Locomotion over Diverse Terrains | | 0
Learn to Tour: Operator Design For Solution Feasibility Mapping in Pickup-and-delivery Traveling Salesman Problem | | 0
Learn What Not to Learn: Action Elimination with Deep Reinforcement Learning | | 0
Learn Zero-Constraint-Violation Policy in Model-Free Constrained Reinforcement Learning | | 0
Least-Squares Temporal Difference Learning for the Linear Quadratic Regulator | | 0
On the Model-Misspecification in Reinforcement Learning | | 0
Lecture Notes on Partially Known MDPs | | 0
LeDeepChef: Deep Reinforcement Learning Agent for Families of Text-Based Games | | 0
Score vs. Winrate in Score-Based Games: which Reward for Reinforcement Learning? | | 0
Lessons from reinforcement learning for biological representations of space | | 0
Lessons Learned from Data-Driven Building Control Experiments: Contrasting Gaussian Process-based MPC, Bilevel DeePC, and Deep Reinforcement Learning | | 0
Less Suboptimal Learning and Control in Variational POMDPs | | 0
Let's Reason Formally: Natural-Formal Hybrid Reasoning Enhances LLM's Math Capability | | 0
Leveling the Playing Field: Carefully Comparing Classical and Learned Controllers for Quadrotor Trajectory Tracking | | 0
Cognitive Level-k Meta-Learning for Safe and Pedestrian-Aware Autonomous Driving | | 0
Leverage the Average: an Analysis of KL Regularization in RL | | 0
Leverage the Average: an Analysis of KL Regularization in Reinforcement Learning | | 0
Leveraging Domain-Unlabeled Data in Offline Reinforcement Learning across Two Domains | | 0
Leveraging Grammar and Reinforcement Learning for Neural Program Synthesis | | 0
Page 249 of 605

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | | Unverified