SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment to maximize a cumulative reward signal. The agent interacts with the environment and learns by receiving feedback in the form of rewards or punishments for its actions. The goal of reinforcement learning is to find the optimal policy or decision-making strategy that maximizes the long-term reward.
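As a rough illustration of the loop described above (act, receive a reward, improve the policy), here is a minimal sketch using tabular Q-learning. Q-learning is one standard RL algorithm chosen here for illustration; it is not taken from this page or from any of the listed papers. The `ChainEnv` environment, its reward of 1.0 at the last state, and all hyperparameter values are hypothetical.

```python
import random


class ChainEnv:
    """Hypothetical toy environment: states 0..n-1 in a row.

    The agent starts at state 0 and receives reward 1.0 only when it
    reaches the last state, which ends the episode.
    """

    def __init__(self, n_states=5):
        self.n_states = n_states
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        # action 0 = move left, action 1 = move right (clamped to the chain)
        delta = 1 if action == 1 else -1
        self.state = min(max(self.state + delta, 0), self.n_states - 1)
        done = self.state == self.n_states - 1
        reward = 1.0 if done else 0.0
        return self.state, reward, done


def q_learning(env, episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1,
               max_steps=100, seed=0):
    """Learn action values with epsilon-greedy tabular Q-learning."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(env.n_states)]
    for _ in range(episodes):
        state = env.reset()
        for _ in range(max_steps):
            if rng.random() < epsilon:
                action = rng.randrange(2)  # explore
            else:
                # exploit, breaking ties at random
                best = max(q[state])
                action = rng.choice(
                    [a for a in range(2) if q[state][a] == best])
            next_state, reward, done = env.step(action)
            # The target is the reward plus the discounted best future value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Pick the highest-valued action in each state."""
    return [max(range(len(row)), key=row.__getitem__) for row in q]
```

Calling `greedy_policy(q_learning(ChainEnv()))` returns one action per state. In this toy setting the learned policy should prefer moving right in every non-terminal state, since that is the only route to the reward.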

Papers

Showing 13901-13950 of 15113 papers

Title | Status | Hype
Differentiable lower bound for expected BLEU score | Code | 0
Application of Self-Play Reinforcement Learning to a Four-Player Game of Imperfect Information | Code | 0
Online Reinforcement Learning in Non-Stationary Context-Driven Environments | Code | 0
DeepTPI: Test Point Insertion with Deep Reinforcement Learning | Code | 0
Hierarchical Reinforcement Learning with Optimal Level Synchronization based on a Deep Generative Model | Code | 0
Hierarchical Reinforcement Learning with Advantage-Based Auxiliary Rewards | Code | 0
Koopman Spectrum Nonlinear Regulators and Efficient Online Learning | Code | 0
Differentially Private Regret Minimization in Episodic Markov Decision Processes | Code | 0
Autonomous robotic nanofabrication with reinforcement learning | Code | 0
A Deep Reinforcement Learning Framework For Column Generation | Code | 0
Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition | Code | 0
Deep TAMER: Interactive Agent Shaping in High-Dimensional State Spaces | Code | 0
A Meta Reinforcement Learning Approach for Predictive Autoscaling in the Cloud | Code | 0
Fairness Through Counterfactual Utilities | Code | 0
FairStream: Fair Multimedia Streaming Benchmark for Reinforcement Learning Agents | Code | 0
DeepSynth: Automata Synthesis for Automatic Task Segmentation in Deep Reinforcement Learning | Code | 0
A Meta-MDP Approach to Exploration for Lifelong Reinforcement Learning | Code | 0
Deep Successor Reinforcement Learning | Code | 0
Deep Spatial Autoencoders for Visuomotor Learning | Code | 0
Autonomous Option Invention for Continual Hierarchical Reinforcement Learning and Planning | Code | 0
Action-Attentive Deep Reinforcement Learning for Autonomous Alignment of Beamlines | Code | 0
DeepSim: A Reinforcement Learning Environment Build Toolkit for ROS and Gazebo | Code | 0
Diffusion Blend: Inference-Time Multi-Preference Alignment for Diffusion Models | Code | 0
Deep RTS: A Game Environment for Deep Reinforcement Learning in Real-Time Strategy Games | Code | 0
Deep reinforcement learning with time-scale invariant memory | Code | 0
Fantastic Rewards and How to Tame Them: A Case Study on Reward Learning for Task-oriented Dialogue Systems | Code | 0
Deep Reinforcement Learning with Swin Transformers | Code | 0
Deep Reinforcement Learning with Stacked Hierarchical Attention for Text-based Games | Code | 0
Conjugated Discrete Distributions for Distributional Reinforcement Learning | Code | 0
Graph-based State Representation for Deep Reinforcement Learning | Code | 0
Deep Reinforcement Learning with Function Properties in Mean Reversion Strategies | Code | 0
Deep Reinforcement Learning with Feedback-based Exploration | Code | 0
Fast, Accurate and Lightweight Super-Resolution with Neural Architecture Search | Code | 0
Hierarchical Text Generation and Planning for Strategic Dialogue | Code | 0
Deep Reinforcement Learning with a Natural Language Action Space | Code | 0
L2Explorer: A Lifelong Reinforcement Learning Assessment Environment | Code | 0
Digital Twin Aided Channel Estimation: Zone-Specific Subspace Prediction and Calibration | Code | 0
Accommodating Picky Customers: Regret Bound and Exploration Complexity for Multi-Objective Reinforcement Learning | Code | 0
Confidence Aware Inverse Constrained Reinforcement Learning | Code | 0
L2SR: Learning to Sample and Reconstruct for Accelerated MRI via Reinforcement Learning | Code | 0
Laboratory Experiments of Model-based Reinforcement Learning for Adaptive Optics Control | Code | 0
A Machine with Short-Term, Episodic, and Semantic Memory Systems | Code | 0
Conditionally Optimistic Exploration for Cooperative Deep Multi-Agent Reinforcement Learning | Code | 0
Fast deep reinforcement learning using online adjustments from the past | Code | 0
Learning Light Transport the Reinforced Way | Code | 0
Long-Term Visitation Value for Deep Exploration in Sparse Reward Reinforcement Learning | Code | 0
Dimension-Wise Importance Sampling Weight Clipping for Sample-Efficient Reinforcement Learning | Code | 0
Deep Reinforcement Learning with a Combinatorial Action Space for Predicting Popular Reddit Threads | Code | 0
LaGR-SEQ: Language-Guided Reinforcement Learning with Sample-Efficient Querying | Code | 0
Integrating Reinforcement Learning, Action Model Learning, and Numeric Planning for Tackling Complex Tasks | Code | 0
Page 279 of 303

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | | Unverified