SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from the feedback it receives, in the form of rewards or penalties, for its actions. The goal is to find the optimal policy, that is, the decision-making strategy that maximizes long-term reward.
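As a rough illustration of the agent-environment loop described above, here is a minimal sketch using tabular Q-learning, one of many RL algorithms and chosen here only for brevity. The five-state corridor environment, the `step`, `train`, and `greedy_policy` helpers, and all hyperparameter values are hypothetical and not taken from any paper listed on this page.

```python
import random

# Hypothetical toy environment: a corridor of 5 states (0..4).
# The agent starts in state 0 and receives reward 1.0 on reaching state 4.
N_STATES = 5
ACTIONS = (-1, +1)  # index 0 = move left, index 1 = move right


def step(state, action):
    """Apply an action and return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration (assumed settings)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action_index]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon, or when both estimates tie.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, ACTIONS[a])
            # Bootstrapped target: immediate reward plus discounted future value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """Read off the greedy action for every non-terminal state."""
    return [ACTIONS[0] if row[0] > row[1] else ACTIONS[1] for row in q[:-1]]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

In this toy setting the learned greedy policy moves right in every non-terminal state, which is the strategy that maximizes the discounted long-term reward. Real benchmarks replace the table with a function approximator and the corridor with a much richer environment.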

Papers

Showing 13851–13900 of 15113 papers

| Title | Status | Hype |
| --- | --- | --- |
| Discourse-Aware Neural Rewards for Coherent Text Generation | | 0 |
| Deep Reinforcement Learning for Optimal Control of Space Heating | | 0 |
| End-to-End Reinforcement Learning for Automatic Taxonomy Induction | Code | 0 |
| Metatrace Actor-Critic: Online Step-size Tuning by Meta-gradient Descent for Reinforcement Learning Control | | 0 |
| Reward Estimation for Variance Reduction in Deep Reinforcement Learning | Code | 0 |
| Multimodal Hierarchical Reinforcement Learning Policy for Task-Oriented Visual Dialog | | 0 |
| FFNet: Video Fast-Forwarding via Reinforcement Learning | Code | 0 |
| Deep Reinforcement Learning for Page-wise Recommendations | | 0 |
| Planning and Learning with Stochastic Action Sets | | 0 |
| Multimodal Machine Translation with Reinforcement Learning | | 0 |
| Deep Reinforcement Learning for Playing 2.5D Fighting Games | Code | 0 |
| Developing parsimonious ensembles using ensemble diversity within a reinforcement learning framework | | 0 |
| Exploration by Distributional Reinforcement Learning | | 0 |
| Motion Planning Among Dynamic, Decision-Making Agents with Deep Reinforcement Learning | Code | 0 |
| VINE: An Open Source Interactive Data Visualization Tool for Neuroevolution | Code | 0 |
| A Reinforcement Learning Approach to Interactive-Predictive Neural Machine Translation | Code | 0 |
| Robust Deep Reinforcement Learning for Security and Safety in Autonomous Vehicle Systems | | 0 |
| Robust Log-Optimal Strategy with Reinforcement Learning | | 0 |
| Falsification of Cyber-Physical Systems Using Deep Reinforcement Learning | | 0 |
| Dialog-based Interactive Image Retrieval | Code | 0 |
| Generating Interpretable Fuzzy Controllers using Particle Swarm Optimization and Genetic Programming | | 0 |
| A Tree Search Algorithm for Sequence Labeling | Code | 0 |
| From Credit Assignment to Entropy Regularization: Two New Algorithms for Neural Sequence Prediction | Code | 0 |
| Towards Experienced Anomaly Detector through Reinforcement Learning | | 0 |
| Sentiment Adaptive End-to-End Dialog Systems | | 0 |
| Deep Reinforcement Learning to Acquire Navigation Skills for Wheel-Legged Robots in Complex Environments | | 0 |
| Decoupling Dynamics and Reward for Transfer Learning | Code | 0 |
| Action Categorization for Computationally Improved Task Learning and Planning | | 0 |
| Multiagent Soft Q-Learning | | 0 |
| Towards Symbolic Reinforcement Learning with Common Sense | Code | 0 |
| Benchmarking projective simulation in navigation problems | | 0 |
| Distributed Distributional Deterministic Policy Gradients | Code | 0 |
| Crawling in Rogue's dungeons with (partitioned) A3C | Code | 0 |
| MQGrad: Reinforcement Learning of Gradient Quantization in Parameter Server | | 0 |
| Event Extraction with Generative Adversarial Imitation Learning | | 0 |
| PEORL: Integrating Symbolic Planning and Hierarchical Reinforcement Learning for Robust Decision-Making | | 0 |
| Cell Selection with Deep Reinforcement Learning in Sparse Mobile Crowdsensing | | 0 |
| Learning to Extract Coherent Summary via Deep Reinforcement Learning | | 0 |
| Lipschitz Continuity in Model-based Reinforcement Learning | Code | 0 |
| Disentangling Controllable and Uncontrollable Factors of Variation by Interacting with the World | | 0 |
| A Study on Overfitting in Deep Reinforcement Learning | Code | 0 |
| Dialogue Learning with Human Teaching and Feedback in End-to-End Trainable Task-Oriented Dialogue Systems | Code | 0 |
| Automated vehicle's behavior decision making using deep reinforcement learning and high-fidelity simulation environment | | 0 |
| Model-Free Linear Quadratic Control via Reduction to Expert Prediction | | 0 |
| On Improving Deep Reinforcement Learning for POMDPs | | 0 |
| State-Augmentation Transformations for Risk-Sensitive Reinforcement Learning | | 0 |
| Learning How to Self-Learn: Enhancing Self-Training Using Neural Reinforcement Learning | | 0 |
| CytonRL: an Efficient Reinforcement Learning Open-source Toolkit Implemented in C++ | Code | 0 |
| Robust Dual View Deep Agent | | 0 |
| Optimizing Query Evaluations using Reinforcement Learning for Web Search | | 0 |
Page 278 of 303

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |