SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes long-term reward.
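The interaction loop described above can be sketched in a few lines. This is a minimal illustration only: the `Corridor` environment, its reward values, and the two fixed policies are hypothetical and are not taken from any paper listed on this page. No learning happens here; the sketch only shows how a policy's cumulative reward is measured, which is the quantity an RL algorithm would try to maximize.

```python
from typing import Callable, Tuple


class Corridor:
    """Hypothetical toy environment: cells 0..length-1, start at 0, goal at the right end."""

    def __init__(self, length: int = 5) -> None:
        self.length = length
        self.state = 0

    def reset(self) -> int:
        self.state = 0
        return self.state

    def step(self, action: int) -> Tuple[int, float, bool]:
        # action 1 moves right, any other action moves left; walls clamp the position
        move = 1 if action == 1 else -1
        self.state = min(max(self.state + move, 0), self.length - 1)
        done = self.state == self.length - 1
        # assumed reward scheme: +1 for reaching the goal, small penalty per other step
        reward = 1.0 if done else -0.1
        return self.state, reward, done


def rollout(env: Corridor, policy: Callable[[int], int], max_steps: int = 50) -> float:
    """Run one episode and return the cumulative (undiscounted) reward."""
    state = env.reset()
    total = 0.0
    for _ in range(max_steps):
        state, reward, done = env.step(policy(state))
        total += reward
        if done:
            break
    return total


def always_right(state: int) -> int:
    return 1


def always_left(state: int) -> int:
    return 0


if __name__ == "__main__":
    print("always_right:", rollout(Corridor(), always_right))
    print("always_left:", rollout(Corridor(), always_left))
```

Under these assumed rewards, `always_right` reaches the goal in four steps and collects about 0.7, while `always_left` never reaches it and accumulates about -5.0 over 50 steps, so `always_right` is the better policy in this toy setting.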

Papers

Showing 10651–10675 of 15113 papers

Title | Status | Hype
Explicit Explore, Exploit, or Escape (E^4): near-optimal safety-constrained reinforcement learning in polynomial time | | 0
Explicit Lipschitz Value Estimation Enhances Policy Robustness Against Perturbation | | 0
Explicit Mean-Square Error Bounds for Monte-Carlo and Linear Stochastic Approximation | | 0
Explicit Pareto Front Optimization for Constrained Reinforcement Learning | | 0
Explicit Planning for Efficient Exploration in Reinforcement Learning | | 0
Explicit Recall for Efficient Exploration | | 0
Explicit User Manipulation in Reinforcement Learning Based Recommender Systems | | 0
Exploiting Action Impact Regularity and Exogenous State Variables for Offline Reinforcement Learning | | 0
Exploiting Contextual Structure to Generate Useful Auxiliary Tasks | | 0
Exploiting Deep Reinforcement Learning for Edge Caching in Cell-Free Massive MIMO Systems | | 0
Exploiting Environmental Variation to Improve Policy Robustness in Reinforcement Learning | | 0
Exploiting Estimation Bias in Clipped Double Q-Learning for Continous Control Reinforcement Learning Tasks | | 0
Exploiting generalisation symmetries in accuracy-based learning classifier systems: An initial study | | 0
Exploiting Generalization in Offline Reinforcement Learning via Unseen State Augmentations | | 0
Exploiting generalization in the subspaces for faster model-based learning | | 0
Exploiting Hierarchy for Learning and Transfer in KL-regularized RL | | 0
Facilitating Sim-to-real by Intrinsic Stochasticity of Real-Time Simulation in Reinforcement Learning for Robot Manipulation | | 0
Exploiting Language Instructions for Interpretable and Compositional Reinforcement Learning | | 0
Exploiting Noisy Data in Distant Supervision Relation Classification | | 0
Exploiting Semantic Epsilon Greedy Exploration Strategy in Multi-Agent Reinforcement Learning | | 0
Exploiting Symbolic Heuristics for the Synthesis of Domain-Specific Temporal Planning Guidance using Reinforcement Learning | | 0
Exploiting the potential of deep reinforcement learning for classification tasks in high-dimensional and unstructured data | | 0
Exploiting Unlabeled Data for Feedback Efficient Human Preference based Reinforcement Learning | | 0
Exploration and Incentives in Reinforcement Learning | | 0
Exploration by Distributional Reinforcement Learning | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | | Unverified