SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes long-term reward.
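As a rough illustration of the interaction loop described above, here is a minimal sketch of tabular Q-learning. Q-learning is just one of many RL algorithms, chosen here for brevity. The chain environment, the function names (`train_q_learning`, `greedy_policy`), and all hyperparameter values are assumptions made for this example and do not come from any paper listed on this page.

```python
import random


def train_q_learning(n_states=5, episodes=500, alpha=0.1, gamma=0.9,
                     epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical chain environment.

    States are 0..n_states-1. Action 0 moves left, action 1 moves right.
    Reaching the last state yields reward 1 and ends the episode.
    """
    rng = random.Random(seed)
    moves = (-1, 1)
    goal = n_states - 1
    q = [[0.0, 0.0] for _ in range(n_states)]

    for _ in range(episodes):
        state = 0
        for _ in range(100):  # step cap so every episode terminates
            if rng.random() < epsilon:
                action = rng.randrange(2)  # explore
            else:
                best = max(q[state])  # exploit, breaking ties at random
                action = rng.choice(
                    [a for a, v in enumerate(q[state]) if v == best])

            next_state = min(max(state + moves[action], 0), goal)
            done = next_state == goal
            reward = 1.0 if done else 0.0

            # Move Q(s, a) toward the one-step bootstrapped target.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])

            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Return the highest-valued action index for each state."""
    return [max(range(len(row)), key=lambda a: row[a]) for row in q]


if __name__ == "__main__":
    q_table = train_q_learning()
    print(greedy_policy(q_table)[:-1])  # actions for the non-terminal states
```

In this toy setting the reward is the feedback signal, the Q-table stores the agent's estimate of long-term (discounted) reward, and the greedy policy read off the table is the learned decision-making strategy. With these settings the policy should end up choosing "right" in every non-terminal state.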

Papers

Showing 11376–11400 of 15113 papers

| Title | Status | Hype |
| --- | --- | --- |
| Learning State Abstractions for Transfer in Continuous Control | Code | 0 |
| A data-driven choice of misfit function for FWI using reinforcement learning | | 0 |
| Analyzing Policy Distillation on Multi-Task Learning and Meta-Reinforcement Learning in Meta-World | | 0 |
| Description Based Text Classification with Reinforcement Learning | | 0 |
| Causally Correct Partial Models for Reinforcement Learning | | 0 |
| Explicit Mean-Square Error Bounds for Monte-Carlo and Linear Stochastic Approximation | | 0 |
| Accelerating Reinforcement Learning for Reaching using Continuous Curriculum Learning | | 0 |
| Automated Lane Change Strategy using Proximal Policy Optimization-based Deep Reinforcement Learning | | 0 |
| Bayesian Residual Policy Optimization: Scalable Bayesian Reinforcement Learning with Clairvoyant Experts | | 0 |
| Representation of Reinforcement Learning Policies in Reproducing Kernel Hilbert Spaces | | 0 |
| Manipulating Reinforcement Learning: Poisoning Attacks on Cost Signals | | 0 |
| Reward-Free Exploration for Reinforcement Learning | | 0 |
| Student/Teacher Advising through Reward Augmentation | | 0 |
| Reinforcement Learning in Factored MDPs: Oracle-Efficient Algorithms and Tighter Regret Bounds for the Non-Episodic Setting | | 0 |
| Temporal-adaptive Hierarchical Reinforcement Learning | | 0 |
| Social diversity and social preferences in mixed-motive reinforcement learning | | 0 |
| Mutual Information-based State-Control for Intrinsically Motivated Reinforcement Learning | | 0 |
| Deep Radial-Basis Value Functions for Continuous Control | | 0 |
| Learning Task-Driven Control Policies via Information Bottlenecks | | 0 |
| Finite Time Analysis of Linear Two-timescale Stochastic Approximation with Markovian Noise | | 0 |
| Bootstrapping a DQN Replay Memory with Synthetic Experiences | | 0 |
| Policy Gradient based Quantum Approximate Optimization Algorithm | | 0 |
| Finite-Sample Analysis of Stochastic Approximation Using Smooth Convex Envelopes | | 0 |
| Evolutionary algorithms for constructing an ensemble of decision trees | | 0 |
| Deep Reinforcement Learning for Autonomous Driving: A Survey | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |