SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal of reinforcement learning is to find an optimal policy, that is, a decision-making strategy that maximizes the expected long-term reward.
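The interaction loop described above can be illustrated with a minimal sketch. It uses tabular Q-learning, which is only one of many RL algorithms, on a hypothetical five-state chain environment; the names `step`, `train`, and `greedy_policy`, the environment, and all hyperparameter values are assumptions made for illustration and do not come from any paper listed on this page.

```python
import random

N_STATES = 5      # states 0..4; state 4 is terminal (hypothetical toy environment)
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Toy environment: reward 1.0 only when the last state is reached."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon (or on ties); otherwise act greedily.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.choice(ACTIONS)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, action)
            # Target: immediate reward plus discounted value of the best next action.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Action with the highest estimated value in each non-terminal state."""
    return [0 if q[s][0] > q[s][1] else 1 for s in range(N_STATES - 1)]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

In this sketch the discount factor `gamma` is what makes the agent value long-term reward: the only reward sits at the end of the chain, so the learned policy has to prefer moving right in every state even though most steps return zero reward.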

Papers

Showing 13601-13650 of 15113 papers

Title | Status | Hype
Virtual Replay Cache | Code | 0
Virtual-Taobao: Virtualizing Real-world Online Retail Environment for Reinforcement Learning | Code | 0
Model-Free Episodic Control | Code | 0
Exploration Policies for On-the-Fly Controller Synthesis: A Reinforcement Learning Approach | Code | 0
Suphx: Mastering Mahjong with Deep Reinforcement Learning | Code | 0
Neural Improvement Heuristics for Graph Combinatorial Optimization Problems | Code | 0
Efficient Parallel Reinforcement Learning Framework using the Reactor Model | Code | 0
Scaling Laws for a Multi-Agent Reinforcement Learning Model | Code | 0
Towards Better Interpretability in Deep Q-Networks | Code | 0
Neural Episodic Control | Code | 0
Optimizing Differentiable Relaxations of Coreference Evaluation Metrics | Code | 0
Surprise-Adaptive Intrinsic Motivation for Unsupervised Reinforcement Learning | Code | 0
Query Focused Multi-document Summarisation of Biomedical Texts: Macquarie University and the Australian National University at BioASQ8b | Code | 0
MinAtar: An Atari-Inspired Testbed for Thorough and Reproducible Reinforcement Learning Experiments | Code | 0
Query Focused Multi-document Summarisation of Biomedical Texts | Code | 0
Surprising Negative Results for Generative Adversarial Tree Search | Code | 0
Towards biologically plausible Dreaming and Planning in recurrent spiking networks | Code | 0
Virtual to Real Reinforcement Learning for Autonomous Driving | Code | 0
Optimized Recommender Systems with Deep Reinforcement Learning | Code | 0
Budgeted Reinforcement Learning in Continuous State Space | Code | 0
Using reinforcement learning to find an optimal set of features | Code | 0
Optimization-Based Algebraic Multigrid Coarsening Using Reinforcement Learning | Code | 0
Surveillance Evasion Through Bayesian Reinforcement Learning | Code | 0
Towards Closing the Sim-to-Real Gap in Collaborative Multi-Robot Deep Reinforcement Learning | Code | 0
Memory-Efficient Episodic Control Reinforcement Learning with Dynamic Online k-means | Code | 0
Using reinforcement learning to improve drone-based inference of greenhouse gas fluxes | Code | 0
Using reinforcement learning to learn how to play text-based games | Code | 0
Motion Planning Among Dynamic, Decision-Making Agents with Deep Reinforcement Learning | Code | 0
Mo' States Mo' Problems: Emergency Stop Mechanisms from Observation | Code | 0
Scheduled Policy Optimization for Natural Language Communication with Intelligent Agents | Code | 0
MASAI: Multi-agent Summative Assessment Improvement for Unsupervised Environment Design | Code | 0
Visceral Machines: Risk-Aversion in Reinforcement Learning with Intrinsic Physiological Rewards | Code | 0
Query-based Targeted Action-Space Adversarial Policies on Deep Reinforcement Learning Agents | Code | 0
Marvel: Accelerating Safe Online Reinforcement Learning with Finetuned Offline Policy | Code | 0
Using Reward Machines for High-Level Task Specification and Decomposition in Reinforcement Learning | Code | 0
Mildly Constrained Evaluation Policy for Offline Reinforcement Learning | Code | 0
Optimistic Posterior Sampling for Reinforcement Learning with Few Samples and Tight Guarantees | Code | 0
Quantum reinforcement learning | Code | 0
Optimistic Linear Support and Successor Features as a Basis for Optimal Policy Transfer | Code | 0
Neural Architecture Search with Reinforcement Learning | Code | 0
MOSEAC: Streamlined Variable Time Step Reinforcement Learning | Code | 0
SVRG for Policy Evaluation with Fewer Gradient Evaluations | Code | 0
Optimistic Distributionally Robust Policy Optimization | Code | 0
Model-free and Bayesian Ensembling Model-based Deep Reinforcement Learning for Particle Accelerator Control Demonstrated on the FERMI FEL | Code | 0
Quantum Deep Reinforcement Learning for Robot Navigation Tasks | Code | 0
ScrofaZero: Mastering Trick-taking Poker Game Gongzhu by Deep Reinforcement Learning | Code | 0
Scrutinize What We Ignore: Reining In Task Representation Shift Of Context-Based Offline Meta Reinforcement Learning | Code | 0
Using State Predictions for Value Regularization in Curiosity Driven Deep Reinforcement Learning | Code | 0
Optimistic Critic Reconstruction and Constrained Fine-Tuning for General Offline-to-Online RL | Code | 0
Machine Teaching for Inverse Reinforcement Learning: Algorithms and Applications | Code | 0
Page 273 of 303

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | | Unverified