SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment to maximize a cumulative reward signal. The agent interacts with the environment and learns by receiving feedback in the form of rewards or punishments for its actions. The goal of reinforcement learning is to find the optimal policy or decision-making strategy that maximizes the long-term reward.
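The interaction loop described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and does not come from any paper listed on this page: a hypothetical five-state chain environment, tabular Q-learning as the learning rule, and the specific values for the discount factor, learning rate, and exploration rate. The agent starts at state 0, receives a reward of 1 only on reaching the final state, and should learn the policy "always move right".

```python
import random

N_STATES = 5          # states 0..4; state 4 is terminal (hypothetical toy chain)
ACTIONS = (0, 1)      # 0 = move left, 1 = move right
GAMMA = 0.9           # discount factor (assumed value)


def step(state, action):
    """Toy environment: return (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0   # reward only when the goal is reached
    return next_state, reward, done


def train(episodes=500, alpha=0.5, epsilon=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)          # explore
            else:
                best = max(q[state])                  # exploit, random tie-break
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # Bootstrapped target: immediate reward plus discounted future value.
            target = reward + (0.0 if done else GAMMA * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Best action in each non-terminal state according to the Q-table."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


q_table = train()
policy = greedy_policy(q_table)
print(policy)
```

In this sketch the learned value of moving right from state 0 approaches 0.9 ** 3, the discounted value of a reward that arrives four steps later. That is the "long-term reward" the definition refers to.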

Papers

Showing 13101–13150 of 15113 papers

Title | Status | Hype
Temporal Regularization for Markov Decision Process | Code | 0
Zero-Shot Transfer with Deictic Object-Oriented Representation in Reinforcement Learning |  | 0
Transfer of Value Functions via Variational Methods |  | 0
The Importance of Sampling in Meta-Reinforcement Learning |  | 0
Negotiable Reinforcement Learning for Pareto Optimal Sequential Decision-Making |  | 0
Constrained Cross-Entropy Method for Safe Reinforcement Learning |  | 0
Data center cooling using model-predictive control | Code | 0
Genetic-Gated Networks for Deep Reinforcement Learning |  | 0
Distributed Multitask Reinforcement Learning with Quadratic Convergence |  | 0
Exponentially Weighted Imitation Learning for Batched Historical Data |  | 0
Fighting Boredom in Recommender Systems with Linear Reinforcement Learning |  | 0
Geometrically Coupled Monte Carlo Sampling |  | 0
Learning Curriculum Policies for Reinforcement Learning | Code | 0
Dynamic Measurement Scheduling for Adverse Event Forecasting using Deep RL |  | 0
How to Organize your Deep Reinforcement Learning Agents: The Importance of Communication Topology |  | 0
Deep Multi-Agent Reinforcement Learning with Relevance Graphs | Code | 0
An Introduction to Deep Reinforcement Learning | Code | 1
BlockPuzzle - A Challenge in Physical Reasoning and Generalization for Robot Learning |  | 0
Modeling natural language emergence with integral transform theory and reinforcement learning | Code | 0
Modulated Policy Hierarchies |  | 0
Using Monte Carlo Tree Search as a Demonstrator within Asynchronous Deep RL |  | 0
Flow Shape Design for Microfluidic Devices Using Deep Reinforcement Learning |  | 0
A Structure-aware Online Learning Algorithm for Markov Decision Processes |  | 0
Deep Reinforcement Learning for Autonomous Driving | Code | 0
Deep Reinforcement Learning for Time Optimal Velocity Control using Prior Knowledge |  | 0
Trajectory-based Learning for Ball-in-Maze Games |  | 0
Unsupervised Control Through Non-Parametric Discriminative Rewards |  | 0
What is Interpretable? Using Machine Learning to Design Interpretable Decision-Support Systems |  | 0
Scaling Configuration of Energy Harvesting Sensors with Reinforcement Learning |  | 0
Quality-Aware Multimodal Saliency Detection via Deep Reinforcement Learning |  | 0
Understanding the impact of entropy on policy optimization | Code | 0
Automatic Face Aging in Videos via Deep Reinforcement Learning |  | 0
Distributed traffic light control at uncoupled intersections with real-world topology by deep reinforcement learning |  | 0
Grammars and reinforcement learning for molecule optimization | Code | 0
Learning State Representations in Complex Systems with Multimodal Data |  | 0
Genetic-Gated Networks for Deep Reinforcement |  | 0
PNS: Population-Guided Novelty Search for Reinforcement Learning in Hard Exploration Environments |  | 0
Environments for Lifelong Reinforcement Learning | Code | 0
Reinforcement Learning for Uplift Modeling | Code | 0
Reinforced Cross-Modal Matching and Self-Supervised Imitation Learning for Vision-Language Navigation |  | 0
A Model-Based Reinforcement Learning Approach for a Rare Disease Diagnostic Task |  | 0
Learning to Activate Relay Nodes: Deep Reinforcement Learning Approach |  | 0
TorchProteinLibrary: A computationally efficient, differentiable representation of protein structure | Code | 0
Model-Based Reinforcement Learning for Sepsis Treatment |  | 0
Learning Goal Embeddings via Self-Play for Hierarchical Reinforcement Learning | Code | 0
Integrating Reinforcement Learning to Self Training for Pulmonary Nodule Segmentation in Chest X-rays |  | 0
Integrating Task-Motion Planning with Reinforcement Learning for Robust Decision Making in Mobile Robots |  | 0
High-Level Strategy Selection under Partial Observability in StarCraft: Brood War |  | 0
Urban Driving with Multi-Objective Deep Reinforcement Learning | Code | 0
Neural Machine Translation with Adequacy-Oriented Learning |  | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 |  | Unverified
2 | PPO | Mean Normalized Performance | 0.58 |  | Unverified