SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment to maximize a cumulative reward signal. The agent interacts with the environment and learns by receiving feedback in the form of rewards or punishments for its actions. The goal of reinforcement learning is to find the optimal policy or decision-making strategy that maximizes the long-term reward.
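The interaction loop described above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the five-state corridor task, the `step` and `train` helpers, and the hyperparameter values are all hypothetical choices made for this example. It uses tabular Q-learning with uniformly random exploration, which is only one of many ways to learn a policy.

```python
import random

N_STATES = 5          # states 0..4; state 4 is the goal (hypothetical toy task)
ACTIONS = (0, 1)      # 0 = move left, 1 = move right
GAMMA = 0.9           # discount factor applied to future rewards (assumed value)
ALPHA = 0.5           # learning rate (assumed value)


def step(state, action):
    """Toy environment dynamics: return (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, seed=0):
    """Run tabular Q-learning and return the table q[state][action]."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Behaviour policy: explore uniformly at random (Q-learning is off-policy).
            action = rng.choice(ACTIONS)
            next_state, reward, done = step(state, action)
            # Move the estimate toward reward + discounted best next-state value.
            target = reward + (0.0 if done else GAMMA * max(q[next_state]))
            q[state][action] += ALPHA * (target - q[state][action])
            state = next_state
    return q


q_table = train()
# Greedy policy: in each non-terminal state, pick the action with the highest estimate.
greedy_policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
print(greedy_policy)
```

In this toy task the only reward is at the right end of the corridor, so the greedy policy read from the learned table should choose "right" (action 1) in every non-terminal state. The discount factor is what turns a single delayed reward into a preference at every earlier state.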

Papers

Showing 6276–6300 of 15113 papers

Title | Status | Hype
LLMStinger: Jailbreaking LLMs using RL fine-tuned LLMs | | 0
LLQL: Logistic Likelihood Q-Learning for Reinforcement Learning | | 0
Reward Guidance for Reinforcement Learning Tasks Based on Large Language Models: The LMGT Framework | | 0
Local Advantage Actor-Critic for Robust Multi-Agent Deep Reinforcement Learning | | 0
Local Advantage Networks for Cooperative Multi-Agent Reinforcement Learning | | 0
Local Communication Protocols for Learning Complex Swarm Behaviors with Deep Reinforcement Learning | | 0
Local Differential Privacy for Regret Minimization in Reinforcement Learning | | 0
Local Environment Poisoning Attacks on Federated Reinforcement Learning | | 0
LocalEscaper: A Weakly-supervised Framework with Regional Reconstruction for Scalable Neural TSP Solvers | | 0
Local Explanations for Reinforcement Learning | | 0
Local Feature Swapping for Generalization in Reinforcement Learning | | 0
Local-Guided Global: Paired Similarity Representation for Visual Reinforcement Learning | | 0
Locality Matters: A Scalable Value Decomposition Approach for Cooperative Multi-Agent Reinforcement Learning | | 0
Localized Observation Abstraction Using Piecewise Linear Spatial Decay for Reinforcement Learning in Combat Simulations | | 0
Localizing by Describing: Attribute-Guided Attention Localization for Fine-Grained Recognition | | 0
Local Linearity: the Key for No-regret Reinforcement Learning in Continuous MDPs | | 0
Local Look-Ahead Guidance via Verifier-in-the-Loop for Automated Theorem Proving | | 0
Locally Constrained Representations in Reinforcement Learning | | 0
Locally Differentially Private Reinforcement Learning for Linear Mixture Markov Decision Processes | | 0
Locally Private Distributed Reinforcement Learning | | 0
Local Navigation and Docking of an Autonomous Robot Mower using Reinforcement Learning and Computer Vision | | 0
Local Nonstationarity for Efficient Bayesian Optimization | | 0
Local Pairwise Distance Matching for Backpropagation-Free Reinforcement Learning | | 0
Local Policy Optimization for Trajectory-Centric Reinforcement Learning | | 0
Local Search for Policy Iteration in Continuous Control | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | | Unverified