SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback, which arrives as rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes long-term reward.
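The interaction loop described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and does not come from this page: the toy `ChainEnv` environment, the `train_q_learning` and `greedy_policy` helpers, and all hyperparameter values are hypothetical, and tabular Q-learning is just one of many algorithms that could fill this role.

```python
import random


class ChainEnv:
    """Hypothetical toy environment: states 0..n-1 on a line, goal at the right end."""

    def __init__(self, n_states=5):
        self.n_states = n_states
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        # Action 0 moves left, action 1 moves right; walls clamp the position.
        move = 1 if action == 1 else -1
        self.state = min(max(self.state + move, 0), self.n_states - 1)
        done = self.state == self.n_states - 1
        # Reward at the goal, small penalty for every other step.
        reward = 1.0 if done else -0.01
        return self.state, reward, done


def train_q_learning(env, episodes=500, alpha=0.1, gamma=0.95,
                     epsilon=0.1, max_steps=100, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration (illustrative values)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(env.n_states)]
    for _ in range(episodes):
        state = env.reset()
        for _ in range(max_steps):
            # Explore with probability epsilon, and break ties randomly.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = env.step(action)
            # Bootstrapped target: immediate reward plus discounted best next value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Pick the higher-valued action in each state (ties resolve to 'right')."""
    return [0 if left > right else 1 for left, right in q]


q_table = train_q_learning(ChainEnv())
policy = greedy_policy(q_table)
print(policy)
```

In this sketch the policy is just a lookup from state to action, and "long-term reward" is captured by the discount factor `gamma` in the update target. Real RL problems typically replace the table with a function approximator and the toy environment with a simulator or physical system.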

Papers

Showing 12501-12550 of 15113 papers

Title | Status | Hype
SoftCTRL: Soft conservative KL-control of Transformer Reinforcement Learning for Autonomous Driving | - | 0
Soft Decomposed Policy-Critic: Bridging the Gap for Effective Continuous Control with Discrete RL | - | 0
Soft Expert Reward Learning for Vision-and-Language Navigation | - | 0
Regularized Softmax Deep Multi-Agent Q-Learning | - | 0
Soft Policy Gradient Method for Maximum Entropy Deep Reinforcement Learning | - | 0
Soft policy optimization using dual-track advantage estimator | - | 0
Soft Q-Learning with Mutual-Information Regularization | - | 0
Soft-Robust Actor-Critic Policy-Gradient | - | 0
Soft-Robust Algorithms for Batch Reinforcement Learning | - | 0
SoK: Adversarial Machine Learning Attacks and Defences in Multi-Agent Reinforcement Learning | - | 0
Solar Power driven EV Charging Optimization with Deep Reinforcement Learning | - | 0
SOLD: Slot Object-Centric Latent Dynamics Models for Relational Manipulation Learning from Pixels | - | 0
Solipsistic Reinforcement Learning | - | 0
SoloParkour: Constrained Reinforcement Learning for Visual Locomotion from Privileged Experience | - | 0
SOLO: Search Online, Learn Offline for Combinatorial Optimization Problems | - | 0
Solver-Informed RL: Grounding Large Language Models for Authentic Optimization Modeling | - | 0
Solve Traveling Salesman Problem by Monte Carlo Tree Search and Deep Neural Network | - | 0
Solving a New 3D Bin Packing Problem with Deep Reinforcement Learning Method | - | 0
Solving Bayesian inverse problems with diffusion priors and off-policy RL | - | 0
Solving Richly Constrained Reinforcement Learning through State Augmentation and Reward Penalties | - | 0
Solving Continual Combinatorial Selection via Deep Reinforcement Learning | - | 0
Solving Finite-Horizon MDPs via Low-Rank Tensors | - | 0
Solving Heterogeneous General Equilibrium Economic Models with Deep Reinforcement Learning | - | 0
Solving Math Word Problems with Double-Decoder Transformer | - | 0
Solving Multi-Goal Robotic Tasks with Decision Transformer | - | 0
Normalized Cut with Reinforcement Learning in Constrained Action Space | - | 0
Solving Online Threat Screening Games using Constrained Action Space Reinforcement Learning | - | 0
Solving optimal stopping problems with Deep Q-Learning | - | 0
Solving Reach-Avoid-Stay Problems Using Deep Deterministic Policy Gradients | - | 0
Solving robust MDPs as a sequence of static RL problems | - | 0
Solving Rubik's Cube Without Tricky Sampling | - | 0
Solving single-objective tasks by preference multi-objective reinforcement learning | - | 0
Solving Sokoban with forward-backward reinforcement learning | - | 0
Solving Stochastic Games | - | 0
Solving the capacitated vehicle routing problem with timing windows using rollouts and MAX-SAT | - | 0
Solving the Order Batching and Sequencing Problem using Deep Reinforcement Learning | - | 0
Solving the single-track train scheduling problem via Deep Reinforcement Learning | - | 0
Solving the Spike Feature Information Vanishing Problem in Spiking Deep Q Network with Potential Based Normalization | - | 0
Solving the swing-up and balance task for the Acrobot and Pendubot with SAC | - | 0
Solving the vehicle routing problem with deep reinforcement learning | - | 0
Some Supervision Required: Incorporating Oracle Policies in Reinforcement Learning via Epistemic Uncertainty Metrics | - | 0
SoNIC: Safe Social Navigation with Adaptive Conformal Inference and Constrained Reinforcement Learning | - | 0
SortingEnv: An Extendable RL-Environment for an Industrial Sorting Process | - | 0
Source-Critical Reinforcement Learning for Transferring Spoken Language Understanding to a New Language | - | 0
Source Critical Reinforcement Learning for Transferring Spoken Language Understanding to a New Language | - | 0
So You Think You Can Scale Up Autonomous Robot Data Collection? | - | 0
Spacecraft Autonomous Decision-Planning for Collision Avoidance: a Reinforcement Learning Approach | - | 0
Space Navigator: a Tool for the Optimization of Collision Avoidance Maneuvers | - | 0
Space Processor Computation Time Analysis for Reinforcement Learning and Run Time Assurance Control Policies | - | 0
Sparse Adversarial Attack in Multi-agent Reinforcement Learning | - | 0
Page 251 of 303

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | - | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | - | Unverified