SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes long-term reward.
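The interaction loop described above can be sketched with tabular Q-learning on a toy problem. This is a minimal illustration, not a method taken from any paper listed here: the five-state chain environment, the `step` function, and all hyperparameters (`alpha`, `gamma`, episode count) are assumptions chosen for clarity.

```python
import random

N_STATES = 5          # states 0..4; reaching state 4 ends the episode
ACTIONS = (0, 1)      # 0 = move left, 1 = move right


def step(state, action):
    """Hypothetical toy environment: returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def discounted_return(rewards, gamma):
    """Cumulative reward the agent tries to maximize: sum of gamma^t * r_t."""
    total = 0.0
    for r in reversed(rewards):
        total = r + gamma * total
    return total


def q_learning(episodes=500, alpha=0.5, gamma=0.9, max_steps=1000, seed=0):
    """Learn action values from interaction, exploring uniformly at random."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            action = rng.choice(ACTIONS)
            next_state, reward, done = step(state, action)
            # Bootstrapped target: reward plus discounted best next value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Pick the highest-valued action in each non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]
```

Calling `greedy_policy(q_learning())` yields the learned decision rule; in this toy chain it should select "move right" (action 1) in every non-terminal state, since that is the shortest route to the reward.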

Papers

Showing 11051-11075 of 15113 papers

| Title | Status | Hype |
|---|---|---|
| Training in Task Space to Speed Up and Guide Reinforcement Learning | | 0 |
| Training Language Models to Critique With Multi-agent Feedback | | 0 |
| Training Large Language Models to Reason via EM Policy Gradient | | 0 |
| Training Larger Networks for Deep Reinforcement Learning | | 0 |
| Training like Playing: A Reinforcement Learning And Knowledge Graph-based framework for building Automatic Consultation System in Medical Field | | 0 |
| LeDex: Training LLMs to Better Self-Debug and Explain Code | | 0 |
| Training Reinforcement Learning Agents and Humans With Difficulty-Conditioned Generators | | 0 |
| Training Value-Aligned Reinforcement Learning Agents Using a Normative Prior | | 0 |
| Trajectory-based Learning for Ball-in-Maze Games | | 0 |
| Trajectory Bellman Residual Minimization: A Simple Value-Based Method for LLM Reasoning | | 0 |
| Trajectory Data Suffices for Statistically Efficient Learning in Offline RL with Linear q^π-Realizability and Concentrability | | 0 |
| Trajectory First: A Curriculum for Discovering Diverse Policies | | 0 |
| Trajectory Modeling via Random Utility Inverse Reinforcement Learning | | 0 |
| Trajectory Optimization for Unknown Constrained Systems using Reinforcement Learning | | 0 |
| Trajectory Planning with Deep Reinforcement Learning in High-Level Action Spaces | | 0 |
| Trajectory representation learning for Multi-Task NMRDPs planning | | 0 |
| Trajectory Tracking of Underactuated Sea Vessels With Uncertain Dynamics: An Integral Reinforcement Learning Approach | | 0 |
| Trajectory-wise Iterative Reinforcement Learning Framework for Auto-bidding | | 0 |
| TrajGen: Generating Realistic and Diverse Trajectories with Reactive and Feasible Agent Behaviors for Autonomous Driving | | 0 |
| TransDreamer: Reinforcement Learning with Transformer World Models | | 0 |
| Transferable Curricula through Difficulty Conditioned Generators | | 0 |
| Transferable Deep Reinforcement Learning Framework for Autonomous Vehicles with Joint Radar-Data Communications | | 0 |
| Transferable Latent-to-Latent Locomotion Policy for Efficient and Versatile Motion Control of Diverse Legged Robots | | 0 |
| Transferable Multi-Agent Reinforcement Learning with Dynamic Participating Agents | | 0 |
| Distributional Successor Features Enable Zero-Shot Policy Optimization | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |