SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes expected long-term reward.
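As a rough illustration of the agent-environment loop described above, here is a minimal sketch using tabular Q-learning. Q-learning is only one of many RL algorithms and is chosen here for brevity. The five-state chain environment, the function names (`step`, `choose_action`, `train_q_learning`, `greedy_policy`), and all hyperparameter values are assumptions made for this example, not something taken from this page.

```python
import random

N_STATES = 5          # hypothetical chain: states 0..4, state 4 is the goal
ACTIONS = (0, 1)      # 0 = move left, 1 = move right


def step(state, action):
    """Toy environment: return (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0   # reward only when the goal is reached
    return next_state, reward, done


def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy selection with random tie-breaking."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    best = max(q[state])
    return rng.choice([a for a in ACTIONS if q[state][a] == best])


def train_q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1,
                     max_steps=1000, seed=0):
    """Learn action values by interacting with the toy environment."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            action = choose_action(q, state, epsilon, rng)
            next_state, reward, done = step(state, action)
            # Terminal states contribute no future value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Best-known action for each non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


if __name__ == "__main__":
    q_table = train_q_learning()
    print(greedy_policy(q_table))
```

In this toy setting the learned greedy policy should prefer moving right in every state, since that is the shortest path to the only reward. The discount factor `gamma` is what makes the agent value reaching the goal sooner rather than later.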

Papers

Showing 14151-14200 of 15113 papers

| Title | Status | Hype |
| --- | --- | --- |
| Markov Decision Processes with Continuous Side Information | | 0 |
| Variational Adaptive-Newton Method for Explorative Learning | | 0 |
| BBQ-Networks: Efficient Exploration in Deep Reinforcement Learning for Task-Oriented Dialogue Systems | | 0 |
| Finding Efficient Swimming Strategies in a Three Dimensional Chaotic Flow by Reinforcement Learning | | 0 |
| Costate-focused models for reinforcement learning | | 0 |
| Saliency-based Sequential Image Attention with Multiset Prediction | | 0 |
| Loss Functions for Multiset Prediction | | 0 |
| Reinforcement Learning in a large scale photonic Recurrent Neural Network | | 0 |
| Classical Structured Prediction Losses for Sequence to Sequence Learning | | 0 |
| SQLNet: Generating Structured Queries From Natural Language Without Reinforcement Learning | Code | 2 |
| Reinforcement Learning of Speech Recognition System Based on Policy Gradient and Hypothesis Selection | | 0 |
| Towards the Use of Deep Reinforcement Learning with Global Policy For Query-based Extractive Summarisation | Code | 0 |
| Accelerated Method for Stochastic Composition Optimization with Nonsmooth Regularization | | 0 |
| Applications of Deep Learning and Reinforcement Learning to Biological Data | | 0 |
| An Empirical Analysis of Multiple-Turn Reasoning Strategies in Reading Comprehension Tasks | | 0 |
| Worm-level Control through Search-based Reinforcement Learning | | 0 |
| LatentPoison - Adversarial Attacks On The Latent Space | Code | 0 |
| Energy Storage Arbitrage in Real-Time Markets via Reinforcement Learning | | 0 |
| Can Deep Reinforcement Learning Solve Erdos-Selfridge-Spencer Games? | Code | 0 |
| Double Q(σ) and Q(σ, λ): Unifying Reinforcement Learning Control Algorithms | | 0 |
| Composing Meta-Policies for Autonomous Driving Using Hierarchical Deep Reinforcement Learning | | 0 |
| Policy Optimization by Genetic Distillation | | 0 |
| Adaptive coordination of working-memory and reinforcement learning in non-human primates performing a trial-and-error problem solving task | Code | 0 |
| A Unified Game-Theoretic Approach to Multiagent Reinforcement Learning | | 0 |
| Automatic Text Summarization Using Reinforcement Learning with Embedding Features | | 0 |
| Learning to Diagnose: Assimilating Clinical Narratives using Deep Reinforcement Learning | | 0 |
| Intelligent Parameter Tuning in Optimization-based Iterative CT Reconstruction via Deep Reinforcement Learning | | 0 |
| Acquiring Target Stacking Skills by Goal-Parameterized Deep Reinforcement Learning | | 0 |
| Paraphrase Generation with Deep Reinforcement Learning | | 0 |
| Regret Minimization for Partially Observable Deep Reinforcement Learning | Code | 0 |
| Backpropagation through the Void: Optimizing control variates for black-box gradient estimation | Code | 0 |
| TreeQN and ATreeC: Differentiable Tree-Structured Models for Deep Reinforcement Learning | Code | 0 |
| Visualizing and Understanding Atari Agents | Code | 0 |
| Automata-Guided Hierarchical Reinforcement Learning for Skill Composition | | 0 |
| Learning Robust Rewards with Adversarial Inverse Reinforcement Learning | Code | 1 |
| Unifying Value Iteration, Advantage Learning, and Dynamic Policy Programming | | 0 |
| Exponential improvements for quantum-accessible reinforcement learning | | 0 |
| Predicting Head Movement in Panoramic Video: A Deep Reinforcement Learning Approach | Code | 0 |
| Action-depedent Control Variates for Policy Optimization via Stein's Identity | Code | 0 |
| Eigenoption Discovery through the Deep Successor Representation | Code | 1 |
| Artificial Intelligence as Structural Estimation: Economic Interpretations of Deep Blue, Bonanza, and AlphaGo | | 0 |
| Sequence-to-Sequence ASR Optimization via Reinforcement Learning | | 0 |
| Diff-DAC: Distributed Actor-Critic for Average Multitask Deep Reinforcement Learning | | 0 |
| Distributional Reinforcement Learning with Quantile Regression | Code | 0 |
| Inverse Reinforcement Learning Under Noisy Observations | | 0 |
| Generalization Tower Network: A Novel Deep Neural Network Architecture for Multi-Task Learning | Code | 0 |
| Learning Approximate Stochastic Transition Models | Code | 0 |
| Accelerated Reinforcement Learning | | 0 |
| Exploiting generalization in the subspaces for faster model-based learning | | 0 |
| Insulin Regimen ML-based control for T2DM patients | | 0 |
Page 284 of 303

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |