SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal of reinforcement learning is to find an optimal policy, that is, a decision-making strategy that maximizes the expected long-term reward.
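The interaction loop described above can be illustrated with a small sketch. It is not taken from any listed paper: the `TwoArmedBandit` environment, its payout probabilities, and the epsilon-greedy agent are all hypothetical choices made for illustration, and a bandit is a single-state simplification of the general RL setting. The `discounted_return` helper shows one common way to define the "long-term reward", assuming a discount factor `gamma`.

```python
import random


def discounted_return(rewards, gamma=0.9):
    """Sum of rewards where the reward at step t is weighted by gamma**t."""
    total = 0.0
    for r in reversed(rewards):
        total = r + gamma * total
    return total


class TwoArmedBandit:
    """Hypothetical toy environment: arm 1 pays out more often than arm 0."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.win_prob = [0.2, 0.8]  # assumed payout probabilities

    def step(self, action):
        # Reward is 1.0 on a win and 0.0 otherwise.
        return 1.0 if self.rng.random() < self.win_prob[action] else 0.0


def run_epsilon_greedy(env, steps=2000, epsilon=0.1, seed=1):
    """Agent loop: act, observe the reward, update the value estimates."""
    rng = random.Random(seed)
    counts = [0, 0]        # number of times each action was taken
    values = [0.0, 0.0]    # running mean reward per action
    rewards = []
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(2)                    # explore
        else:
            action = 0 if values[0] >= values[1] else 1  # exploit
        reward = env.step(action)
        counts[action] += 1
        # Incremental mean update of the estimate for the chosen action.
        values[action] += (reward - values[action]) / counts[action]
        rewards.append(reward)
    return values, counts, rewards


if __name__ == "__main__":
    values, counts, rewards = run_epsilon_greedy(TwoArmedBandit())
    print("estimated values:", values)
    print("action counts:", counts)
    print("total reward:", sum(rewards))
```

Under these assumptions the agent should end up preferring the higher-paying arm, which is the simplest form of learning a policy from reward feedback. Problems with multiple states need a value function or policy conditioned on the state, which this sketch leaves out.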

Papers

Showing 13951-14000 of 15113 papers

| Title | Status | Hype |
|---|---|---|
| Setting up a Reinforcement Learning Task with a Real-World Robot | Code | 0 |
| Composable Deep Reinforcement Learning for Robotic Manipulation | Code | 0 |
| Automated Curriculum Learning by Rewarding Temporally Rare Events | Code | 0 |
| Neural Text Generation: Past, Present and Beyond | | 0 |
| Rearrangement with Nonprehensile Manipulation Using Deep Reinforcement Learning | | 0 |
| Measurement-based adaptation protocol with quantum reinforcement learning | | 0 |
| Automated Speed and Lane Change Decision Making using Deep Reinforcement Learning | | 0 |
| Imitation Learning with Concurrent Actions in 3D Games | | 0 |
| Hierarchical Reinforcement Learning: Approximating Optimal Discounted TSP Using Local Policies | | 0 |
| Learning to Explore with Meta-Policy Gradient | | 0 |
| Active Reinforcement Learning with Monte-Carlo Tree Search | | 0 |
| Policy Search in Continuous Action Domains: an Overview | | 0 |
| Soft-Robust Actor-Critic Policy-Gradient | | 0 |
| Deep reinforcement learning for time series: playing idealized trading games | Code | 0 |
| Kickstarting Deep Reinforcement Learning | | 0 |
| Variance Networks: When Expectation Does Not Meet Your Expectations | Code | 0 |
| SA-IGA: A Multiagent Reinforcement Learning Method Towards Socially Optimal Outcomes | | 0 |
| DeepCAS: A Deep Reinforcement Learning Algorithm for Control-Aware Scheduling | | 0 |
| Feudal Reinforcement Learning for Dialogue Management in Large Domains | | 0 |
| A Multi-Objective Deep Reinforcement Learning Framework | | 0 |
| A Brandom-ian view of Reinforcement Learning towards strong-AI | | 0 |
| Extracting Action Sequences from Texts Based on Deep Reinforcement Learning | | 0 |
| Intent-aware Multi-agent Reinforcement Learning | | 0 |
| Personalized Exposure Control Using Adaptive Metering and Reinforcement Learning | | 0 |
| Smoothed Action Value Functions for Learning Gaussian Policies | | 0 |
| Synthesizing Neural Network Controllers with Probabilistic Model based Reinforcement Learning | Code | 0 |
| Variance-Aware Regret Bounds for Undiscounted Reinforcement Learning in MDPs | | 0 |
| OIL: Observational Imitation Learning | | 0 |
| Some Considerations on Learning to Explore via Meta-Reinforcement Learning | Code | 0 |
| Reinforcement Learning to Rank in E-Commerce Search Engine: Formalization, Analysis, and Application | Code | 0 |
| Model-Free Control for Distributed Stream Data Processing using Deep Reinforcement Learning | | 0 |
| Towards Cooperation in Sequential Prisoner's Dilemmas: a Deep Multiagent Reinforcement Learning Approach | | 0 |
| On Oracle-Efficient PAC RL with Rich Observations | | 0 |
| Inverse Reinforcement Learning via Nonparametric Spatio-Temporal Subgoal Modeling | | 0 |
| Deep Reinforcement Learning for Sponsored Search Real-time Bidding | | 0 |
| Hierarchical Imitation and Reinforcement Learning | | 0 |
| Deep Reinforcement Learning for Join Order Enumeration | | 0 |
| Deep Reinforcement Learning for Vision-Based Robotic Grasping: A Simulated Comparative Evaluation of Off-Policy Methods | Code | 0 |
| Learning by Playing - Solving Sparse Reward Tasks from Scratch | Code | 0 |
| Model-Ensemble Trust-Region Policy Optimization | Code | 0 |
| Model-Based Value Estimation for Efficient Model-Free Reinforcement Learning | | 0 |
| Real-Time Bidding with Multi-Agent Reinforcement Learning in Display Advertising | | 0 |
| The Mirage of Action-Dependent Baselines in Reinforcement Learning | Code | 0 |
| DiGrad: Multi-Task Reinforcement Learning with Shared Actions | | 0 |
| Deep Bayesian Bandits Showdown: An Empirical Comparison of Bayesian Deep Networks for Thompson Sampling | Code | 0 |
| Modeling Others using Oneself in Multi-Agent Reinforcement Learning | | 0 |
| Multi-Goal Reinforcement Learning: Challenging Robotics Environments and Request for Research | Code | 0 |
| Variance Reduction Methods for Sublinear Reinforcement Learning | | 0 |
| Reinforcement and Imitation Learning for Diverse Visuomotor Skills | Code | 0 |
| Temporal Difference Models: Model-Free Deep RL for Model-Based Control | | 0 |
Page 280 of 303

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |