SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes long-term reward.
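The interaction loop described above can be illustrated with a minimal sketch. It assumes a hypothetical five-state corridor (the `step` function, the reward of 1.0 at the goal, and the constants `GAMMA` and `ALPHA` are all invented for illustration) and uses tabular Q-learning with a uniformly random behavior policy, which is only one of many ways to learn a policy.

```python
import random

N_STATES = 5          # states 0..4; state 4 is the terminal goal (assumed toy setup)
ACTIONS = (0, 1)      # 0 = move left, 1 = move right
GAMMA = 0.9           # discount factor applied to future reward
ALPHA = 0.5           # learning rate


def step(state, action):
    """Hypothetical corridor dynamics: returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0   # feedback arrives only on reaching the goal
    return next_state, reward, done


def train(episodes=500, seed=0):
    """Tabular Q-learning with a uniformly random behavior policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            action = rng.choice(ACTIONS)
            next_state, reward, done = step(state, action)
            # Bootstrapped estimate of the discounted long-term reward.
            target = reward if done else reward + GAMMA * max(q[next_state])
            q[state][action] += ALPHA * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Pick the highest-valued action in every non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


q_table = train()
policy = greedy_policy(q_table)
print(policy)
```

In this toy setting the learned greedy policy should move right in every state, since that is the shortest path to the only reward. Real environments are rarely this small or deterministic, so the sketch shows the shape of the loop rather than a practical training recipe.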

Papers

Showing 6601–6625 of 15113 papers

Title | Status | Hype
Tight Finite Time Bounds of Two-Time-Scale Linear Stochastic Approximation with Markovian Noise | | 0
Tight Guarantees for Interactive Decision Making with the Decision-Estimation Coefficient | | 0
Tile Networks: Learning Optimal Geometric Layout for Whole-page Recommendation | | 0
Time Adaptive Reinforcement Learning | | 0
Time-Aware Q-Networks: Resolving Temporal Irregularity for Deep Reinforcement Learning | | 0
Efficient Scheduling of Data Augmentation for Deep Reinforcement Learning | | 0
Time-Scale Separation in Q-Learning: Extending TD(λ) for Action-Value Function Decomposition | | 0
Time-Variant Variational Transfer for Value Functions | | 0
Time your hedge with Deep Reinforcement Learning | | 0
Timing is Everything: Learning to Act Selectively with Costly Actions and Budgetary Constraints | | 0
Timing Process Interventions with Causal Inference and Reinforcement Learning | | 0
tinyMAN: Lightweight Energy Manager using Reinforcement Learning for Energy Harvesting Wearable IoT Devices | | 0
To Beam Or Not To Beam: That is a Question of Cooperation for Language GANs | | 0
To bootstrap or to rollout? An optimal and adaptive interpolation | | 0
To Combine or Not To Combine? A Rainbow Deep Reinforcement Learning Agent for Dialog Policies | | 0
Toddler-Guidance Learning: Impacts of Critical Period on Multimodal AI Agents | | 0
Together We Rise: Optimizing Real-Time Multi-Robot Task Allocation using Coordinated Heterogeneous Plays | | 0
Toggling a Genetic Switch Using Reinforcement Learning | | 0
Token-Efficient RL for LLM Reasoning | | 0
Token-Mol 1.0: Tokenized drug design with large language model | | 0
Tolerance of Reinforcement Learning Controllers against Deviations in Cyber Physical Systems | | 0
TOMA: Topological Map Abstraction for Reinforcement Learning | | 0
Toolpath design for additive manufacturing using deep reinforcement learning | | 0
Topic-Preserving Synthetic News Generation: An Adversarial Deep Reinforcement Learning Approach | | 0
To Risk or Not to Risk: Learning with Risk Quantification for IoT Task Offloading in UAVs | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | | Unverified