SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes the expected long-term reward.
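
The interaction loop described above can be illustrated with a minimal sketch. It uses tabular Q-learning, which is only one of many RL algorithms and is chosen here purely for brevity. The five-state chain environment, the function names (`step`, `train`, `greedy_policy`), and the hyperparameter values are all hypothetical and are not taken from any paper listed on this page.

```python
import random

N_STATES = 5      # hypothetical toy task: states 0..4, state 4 is terminal
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Toy environment: reward 1.0 only when the last state is reached."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration (assumed settings)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)  # explore
            else:
                # Exploit, breaking ties between equal values at random.
                best = max(q[state])
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # Bootstrapped target: immediate reward plus discounted future value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Best action per non-terminal state under the learned values."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


if __name__ == "__main__":
    print(greedy_policy(train()))  # -> [1, 1, 1, 1]
```

In this sketch the learned policy moves right in every non-terminal state, which is the strategy that maximizes the discounted cumulative reward on the toy chain.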

Papers

Showing 13201-13250 of 15113 papers

Title | Status | Hype
Composing Entropic Policies using Divergence Correction | - | 0
The effects of negative adaptation in Model-Agnostic Meta-Learning | - | 0
Relative Entropy Regularized Policy Iteration | Code | 0
Playing Text-Adventure Games with Graph-Based Deep Reinforcement Learning | Code | 0
Hyperbolic Embeddings for Learning Options in Hierarchical Reinforcement Learning | - | 0
Learning Vine Copula Models For Synthetic Data Generation | - | 0
Exploration versus exploitation in reinforcement learning: a stochastic control approach | - | 0
Generative Adversarial Self-Imitation Learning | - | 0
FoldingZero: Protein Folding from Scratch in Hydrophobic-Polar Model | - | 0
Deep Reinforcement Learning for Intelligent Transportation Systems | - | 0
Bach2Bach: Generating Music Using A Deep Reinforcement Learning Approach | - | 0
Resource Constrained Deep Reinforcement Learning | - | 0
Multi-agent Deep Reinforcement Learning with Extremely Noisy Observations | - | 0
Towards Solving Text-based Games by Producing Adaptive Action Spaces | Code | 0
Visual Foresight: Model-Based Deep Reinforcement Learning for Vision-Based Robotic Control | Code | 0
Mitigating Planner Overfitting in Model-Based Reinforcement Learning | - | 0
Revisiting the Softmax Bellman Operator: New Benefits and New Perspective | Code | 0
Macro action selection with deep reinforcement learning in StarCraft | Code | 0
Simple random search of static linear policies is competitive for reinforcement learning | Code | 0
Transfer of Value Functions via Variational Methods | - | 0
Reinforcement Learning with Multiple Experts: A Bayesian Model Combination Approach | - | 0
The Importance of Sampling in Meta-Reinforcement Learning | - | 0
Zero-Shot Transfer with Deictic Object-Oriented Representation in Reinforcement Learning | - | 0
Temporal Regularization for Markov Decision Process | Code | 0
Negotiable Reinforcement Learning for Pareto Optimal Sequential Decision-Making | - | 0
REFUEL: Exploring Sparse Features in Deep Reinforcement Learning for Fast Disease Diagnosis | - | 0
Multiple-Step Greedy Policies in Approximate and Online Reinforcement Learning | - | 0
Learning Curriculum Policies for Reinforcement Learning | Code | 0
Fighting Boredom in Recommender Systems with Linear Reinforcement Learning | - | 0
Data center cooling using model-predictive control | Code | 0
Exponentially Weighted Imitation Learning for Batched Historical Data | - | 0
Dynamic Measurement Scheduling for Adverse Event Forecasting using Deep RL | - | 0
Genetic-Gated Networks for Deep Reinforcement Learning | - | 0
Constrained Cross-Entropy Method for Safe Reinforcement Learning | - | 0
Distributed Multitask Reinforcement Learning with Quadratic Convergence | - | 0
Geometrically Coupled Monte Carlo Sampling | - | 0
BlockPuzzle - A Challenge in Physical Reasoning and Generalization for Robot Learning | - | 0
Deep Multi-Agent Reinforcement Learning with Relevance Graphs | Code | 0
How to Organize your Deep Reinforcement Learning Agents: The Importance of Communication Topology | - | 0
Modeling natural language emergence with integral transform theory and reinforcement learning | Code | 0
Using Monte Carlo Tree Search as a Demonstrator within Asynchronous Deep RL | - | 0
Modulated Policy Hierarchies | - | 0
Flow Shape Design for Microfluidic Devices Using Deep Reinforcement Learning | - | 0
A Structure-aware Online Learning Algorithm for Markov Decision Processes | - | 0
Deep Reinforcement Learning for Autonomous Driving | Code | 0
Deep Reinforcement Learning for Time Optimal Velocity Control using Prior Knowledge | - | 0
Unsupervised Control Through Non-Parametric Discriminative Rewards | - | 0
Trajectory-based Learning for Ball-in-Maze Games | - | 0
What is Interpretable? Using Machine Learning to Design Interpretable Decision-Support Systems | - | 0
Scaling Configuration of Energy Harvesting Sensors with Reinforcement Learning | - | 0
Page 265 of 303

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | - | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | - | Unverified