SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes long-term reward.
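The agent-environment loop described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and does not come from this page: the five-state chain environment, the hypothetical `step` and `train` helpers, the learning rate `alpha`, the discount factor `gamma`, and the choice of tabular Q-learning as just one of many ways to learn a policy.

```python
import random

N_STATES = 5          # states 0..4; state 4 is the terminal goal (assumed toy setup)
ACTIONS = (0, 1)      # 0 = move left, 1 = move right


def step(state, action):
    """Hypothetical deterministic chain environment: returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0   # reward only when the goal is reached
    return next_state, reward, done


def train(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Tabular Q-learning with a uniformly random behavior policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            action = rng.choice(ACTIONS)                 # explore at random
            next_state, reward, done = step(state, action)
            # Bootstrapped target: immediate reward plus discounted best future value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


q_table = train()
# Greedy policy for the non-terminal states: pick the action with the highest Q-value.
greedy_policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
print(greedy_policy)
```

In this toy setting the greedy policy should choose "move right" (action 1) in every non-terminal state, since that is the shortest path to the only reward. Real RL problems typically involve stochastic dynamics, much larger state spaces, and function approximation in place of a table.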

Papers

Showing 11351-11400 of 15113 papers

| Title | Status | Hype |
|---|---|---|
| Value Function Decomposition for Iterative Design of Reinforcement Learning Agents | | 0 |
| Instance-dependent ℓ∞-bounds for policy evaluation in tabular reinforcement learning | | 0 |
| Value function interference and greedy action selection in value-based multi-objective reinforcement learning | | 0 |
| Value Function Spaces: Skill-Centric State Abstractions for Long-Horizon Reasoning | | 0 |
| Dexterous In-hand Manipulation by Guiding Exploration with Simple Sub-skill Controllers | | 0 |
| Value-Incentivized Preference Optimization: A Unified Approach to Online and Offline RLHF | | 0 |
| Value of Information and Reward Specification in Active Inference and POMDPs | | 0 |
| Value Penalized Q-Learning for Recommender Systems | | 0 |
| Value Propagation for Decentralized Networked Deep Multi-agent Reinforcement Learning | | 0 |
| Value Propagation Networks | | 0 |
| Value Pursuit Iteration | | 0 |
| Value Refinement Network (VRN) | | 0 |
| Value Summation: A Novel Scoring Function for MPC-based Model-based Reinforcement Learning | | 0 |
| VARD: Efficient and Dense Fine-Tuning for Diffusion Models with Value-based RL | | 0 |
| Variable Compliance Control for Robotic Peg-in-Hole Assembly: A Deep Reinforcement Learning Approach | | 0 |
| Variable Gain Gradient Descent-based Reinforcement Learning for Robust Optimal Tracking Control of Uncertain Nonlinear System with Input-Constraints | | 0 |
| Variable Impedance Control in End-Effector Space: An Action Space for Reinforcement Learning in Contact-Rich Tasks | | 0 |
| Variance-Aware Off-Policy Evaluation with Linear Function Approximation | | 0 |
| Variance-Aware Regret Bounds for Undiscounted Reinforcement Learning in MDPs | | 0 |
| Variance-aware robust reinforcement learning with linear function approximation under heavy-tailed rewards | | 0 |
| Variance-Based Risk Estimations in Markov Processes via Transformation with State Lumping | | 0 |
| Variance Reduced Advantage Estimation with δ Hindsight Credit Assignment | | 0 |
| Variance-Reduced Conservative Policy Iteration | | 0 |
| Variance-Reduced Off-Policy Memory-Efficient Policy Search | | 0 |
| Variance Reduction for Deep Q-Learning using Stochastic Recursive Gradient | | 0 |
| Variance Reduction for Evolution Strategies via Structured Control Variates | | 0 |
| Variance Reduction for Policy-Gradient Methods via Empirical Variance Minimization | | 0 |
| Variance Reduction for Policy Gradient with Action-Dependent Factorized Baselines | | 0 |
| Variance Reduction for Reinforcement Learning in Input-Driven Environments | | 0 |
| Variance Reduction Methods for Sublinear Reinforcement Learning | | 0 |
| Variational Adaptive-Newton Method for Explorative Learning | | 0 |
| Variational Bayes: A report on approaches and applications | | 0 |
| Variational Bayesian Reinforcement Learning with Regret Bounds | | 0 |
| Variational Constrained Reinforcement Learning with Application to Planning at Roundabout | | 0 |
| Variational Dynamic for Self-Supervised Exploration in Deep Reinforcement Learning | | 0 |
| Variational Empowerment as Representation Learning for Goal-Based Reinforcement Learning | | 0 |
| Variational Inference for Model-Free and Model-Based Reinforcement Learning | | 0 |
| Variational Inference for Policy Gradient | | 0 |
| Variational Inference MPC for Bayesian Model-based Reinforcement Learning | | 0 |
| Variational Intrinsic Control Revisited | | 0 |
| Variational Inverse Control with Events: A General Framework for Data-Driven Reward Definition | | 0 |
| Variational Meta Reinforcement Learning for Social Robotics | | 0 |
| Variational Model-based Policy Optimization | | 0 |
| Variational multiscale reinforcement learning for discovering reduced order closure models of nonlinear spatiotemporal transport systems | | 0 |
| Variational oracle guiding for reinforcement learning | | 0 |
| Variational Policy Gradient Method for Reinforcement Learning with General Utilities | | 0 |
| Variational quantum compiling with double Q-learning | | 0 |
| Parametrized quantum policies for reinforcement learning | | 0 |
| Policy Gradients using Variational Quantum Circuits | | 0 |
| Variational Quantum Reinforcement Learning via Evolutionary Optimization | | 0 |
Page 228 of 303

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |