SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes expected long-term reward.
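The agent-environment loop described above can be made concrete with a small sketch. The example uses tabular Q-learning, one standard RL algorithm picked here purely for illustration; this page does not prescribe any particular method. The five-state corridor environment, the function names (`step`, `train`, `greedy_policy`), and the hyperparameter values are all hypothetical.

```python
import random

# Hypothetical toy environment: a corridor of 5 states (0..4).
# The agent starts in state 0 and receives reward +1 on reaching state 4.
N_STATES = 5
ACTIONS = (-1, +1)  # index 0: move left, index 1: move right
GOAL = N_STATES - 1


def step(state, action):
    """Environment dynamics: return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), GOAL)
    done = next_state == GOAL
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action_index]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon, or when both actions tie.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, ACTIONS[a])
            # Bootstrapped target: immediate reward plus discounted best
            # estimated value of the next state (zero after termination).
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """Action with the highest learned value in each non-terminal state."""
    return [ACTIONS[0] if row[0] > row[1] else ACTIONS[1] for row in q[:GOAL]]


if __name__ == "__main__":
    q_table = train()
    print(greedy_policy(q_table))
```

In this toy setting the learned greedy policy moves right in every non-terminal state, which is the strategy that maximizes the discounted long-term reward. The discount factor `gamma` is what turns the single terminal reward into a preference for reaching the goal sooner.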

Papers

Showing 13901–13950 of 15113 papers

| Title | Status | Hype |
| --- | --- | --- |
| Feature-Based Aggregation and Deep Reinforcement Learning: A Survey and Some New Implementations | | 0 |
| Distort-and-Recover: Color Enhancement using Deep Reinforcement Learning | | 0 |
| DORA The Explorer: Directed Outreaching Reinforcement Action-Selection | Code | 0 |
| Emergence of Linguistic Communication from Referential Games with Symbolic and Pixel Input | Code | 0 |
| Market Making via Reinforcement Learning | Code | 0 |
| Universal Successor Representations for Transfer Reinforcement Learning | | 0 |
| A clustering-based reinforcement learning approach for tailored personalization of e-Health interventions | | 0 |
| Outline Objects using Deep Reinforcement Learning | | 0 |
| Binary Space Partitioning as Intrinsic Reward | | 0 |
| Crafting a Toolchain for Image Restoration by Deep Reinforcement Learning | Code | 0 |
| Gotta Learn Fast: A New Benchmark for Generalization in RL | Code | 0 |
| Latent Space Policies for Hierarchical Reinforcement Learning | | 0 |
| Hierarchical Modular Reinforcement Learning Method and Knowledge Acquisition of State-Action Rule for Multi-target Problem | | 0 |
| Scalable Sentiment for Sequence-to-sequence Chatbot Response with Performance Analysis | | 0 |
| Programmatically Interpretable Reinforcement Learning | | 0 |
| End-to-End Learning of Communications Systems Without a Channel Model | Code | 0 |
| A Human Mixed Strategy Approach to Deep Reinforcement Learning | | 0 |
| Information Maximizing Exploration with a Latent Dynamics Model | | 0 |
| EmoRL: Continuous Acoustic Emotion Classification using Deep Reinforcement Learning | | 0 |
| StarCraft Micromanagement with Reinforcement Learning and Curriculum Transfer Learning | Code | 0 |
| Renewal Monte Carlo: Renewal theory based reinforcement learning | | 0 |
| Recall Traces: Backtracking Models for Efficient Reinforcement Learning | | 0 |
| Curiosity-driven Exploration for Mapless Navigation with Deep Reinforcement Learning | | 0 |
| Learning to Run challenge solutions: Adapting reinforcement learning methods for neuromusculoskeletal environments | Code | 0 |
| Learning to Run challenge: Synthesizing physiologically accurate motion using deep reinforcement learning | | 0 |
| Learning to Navigate in Cities Without a Map | Code | 0 |
| Snap Angle Prediction for 360° Panoramas | | 0 |
| Towards Learning Transferable Conversational Skills using Multi-dimensional Dialogue Modelling | Code | 0 |
| Deep Reinforcement Learning for Traffic Light Control in Vehicular Networks | Code | 0 |
| How an Electrical Engineer Became an Artificial Intelligence Researcher, a Multiphase Active Contours Analysis | | 0 |
| Unsupervised Predictive Memory in a Goal-Directed Agent | Code | 0 |
| Reinforcement learning for non-prehensile manipulation: Transfer from simulation to physical system | | 0 |
| Reinforcement Learning for Fair Dynamic Pricing | | 0 |
| Deep Communicating Agents for Abstractive Summarization | | 0 |
| Forward-Backward Reinforcement Learning | | 0 |
| Scalable photonic reinforcement learning by time-division multiplexing of laser chaos | | 0 |
| Autonomous Ramp Merge Maneuver Based on Reinforcement Learning with Continuous Action Space | | 0 |
| The Importance of Constraint Smoothness for Parameter Estimation in Computational Cognitive Modeling | | 0 |
| Accelerating Learning in Constructive Predictive Frameworks with the Successor Representation | | 0 |
| DOP: Deep Optimistic Planning with Approximate Value Function Evaluation | | 0 |
| Learning State Representations for Query Optimization with Deep Reinforcement Learning | | 0 |
| Deep Reinforcement Learning with Model Learning and Monte Carlo Tree Search in Minecraft | | 0 |
| Neuronal Circuit Policies | Code | 0 |
| Look Before You Leap: Bridging Model-Free and Model-Based Reinforcement Learning for Planned-Ahead Vision-and-Language Navigation | Code | 0 |
| End-to-End Video Captioning with Multitask Reinforcement Learning | Code | 0 |
| Learning Robotic Assembly from CAD | | 0 |
| Meta Reinforcement Learning with Latent Variable Gaussian Processes | | 0 |
| Variance Reduction for Policy Gradient with Action-Dependent Factorized Baselines | | 0 |
| Optimizing Sponsored Search Ranking Strategy by Deep Reinforcement Learning | | 0 |
| Natural Gradient Deep Q-learning | | 0 |
Page 279 of 303

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |