SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment to maximize a cumulative reward signal. The agent interacts with the environment and learns by receiving feedback in the form of rewards or punishments for its actions. The goal of reinforcement learning is to find the optimal policy or decision-making strategy that maximizes the long-term reward.
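As a rough illustration of the loop described above (act, receive a reward, update the policy), here is a minimal sketch using tabular Q-learning. The corridor environment, the `step`, `train`, and `greedy_policy` names, and all hyperparameters are hypothetical choices made for this example. Q-learning is only one of many RL methods and is not implied by any specific paper listed on this page.

```python
import random

N_STATES = 5            # states 0..4; state 4 is the goal (terminal)
ACTIONS = (-1, +1)      # move left or move right
GOAL = N_STATES - 1


def step(state, action):
    """Hypothetical toy environment: returns (next_state, reward, done)."""
    next_state = min(max(state + action, 0), GOAL)
    done = next_state == GOAL
    reward = 1.0 if done else -0.01   # small cost per step, reward at the goal
    return next_state, reward, done


def train(episodes=500, alpha=0.1, gamma=0.95, epsilon=0.1, seed=0):
    """Learn action values with epsilon-greedy tabular Q-learning."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]   # q[state][action_index]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon, otherwise act greedily.
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))
            else:
                a = max(range(len(ACTIONS)), key=lambda i: q[state][i])
            next_state, reward, done = step(state, ACTIONS[a])
            # Bootstrapped target: immediate reward plus discounted future value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """Best action in each non-terminal state according to the learned values."""
    return [ACTIONS[max(range(len(ACTIONS)), key=lambda i: q[s][i])]
            for s in range(GOAL)]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

The discount factor `gamma` is what makes the agent weigh long-term reward rather than only the immediate one: each update pulls the value of an action toward the reward received plus the discounted value of the best action in the next state.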

Papers

Showing 13251-13300 of 15113 papers

| Title | Status | Hype |
| --- | --- | --- |
| Quality-Aware Multimodal Saliency Detection via Deep Reinforcement Learning | | 0 |
| Understanding the impact of entropy on policy optimization | Code | 0 |
| Automatic Face Aging in Videos via Deep Reinforcement Learning | | 0 |
| Distributed traffic light control at uncoupled intersections with real-world topology by deep reinforcement learning | | 0 |
| Grammars and reinforcement learning for molecule optimization | Code | 0 |
| Learning State Representations in Complex Systems with Multimodal Data | | 0 |
| PNS: Population-Guided Novelty Search for Reinforcement Learning in Hard Exploration Environments | | 0 |
| Environments for Lifelong Reinforcement Learning | Code | 0 |
| Genetic-Gated Networks for Deep Reinforcement | | 0 |
| Reinforcement Learning for Uplift Modeling | Code | 0 |
| Reinforced Cross-Modal Matching and Self-Supervised Imitation Learning for Vision-Language Navigation | | 0 |
| A Model-Based Reinforcement Learning Approach for a Rare Disease Diagnostic Task | | 0 |
| Learning to Activate Relay Nodes: Deep Reinforcement Learning Approach | | 0 |
| Model-Based Reinforcement Learning for Sepsis Treatment | | 0 |
| TorchProteinLibrary: A computationally efficient, differentiable representation of protein structure | Code | 0 |
| Learning Goal Embeddings via Self-Play for Hierarchical Reinforcement Learning | Code | 0 |
| Integrating Task-Motion Planning with Reinforcement Learning for Robust Decision Making in Mobile Robots | | 0 |
| High-Level Strategy Selection under Partial Observability in StarCraft: Brood War | | 0 |
| Integrating Reinforcement Learning to Self Training for Pulmonary Nodule Segmentation in Chest X-rays | | 0 |
| Urban Driving with Multi-Objective Deep Reinforcement Learning | Code | 0 |
| Neural Machine Translation with Adequacy-Oriented Learning | | 0 |
| Model Learning for Look-ahead Exploration in Continuous Control | Code | 0 |
| Reinforcement Learning of Active Vision for Manipulating Objects under Occlusions | Code | 0 |
| Scalable agent alignment via reward modeling: a research direction | Code | 0 |
| Simulated Autonomous Driving in a Realistic Driving Environment using Deep Reinforcement Learning and a Deterministic Finite State Machine | | 0 |
| Reinforcement Learning with A* and a Deep Heuristic | Code | 0 |
| Reinforcement Learning and Inverse Reinforcement Learning with System 1 and System 2 | | 0 |
| Measurement-based adaptation protocol with quantum reinforcement learning in a Rigetti quantum computer | | 0 |
| Energy Efficiency in Reinforcement Learning for Wireless Sensor Networks | | 0 |
| Learning Actionable Representations with Goal-Conditioned Policies | Code | 0 |
| Self-Organizing Maps for Storage and Transfer of Knowledge in Reinforcement Learning | | 0 |
| Policy Optimization with Model-based Explorations | | 0 |
| Recursive Sparse Pseudo-input Gaussian Process SARSA | | 0 |
| Parameter Sharing Reinforcement Learning Architecture for Multi Agent Driving Behaviors | | 0 |
| Autonomous Extraction of a Hierarchical Structure of Tasks in Reinforcement Learning, A Sequential Associate Rule Mining Approach | | 0 |
| Emergence of linguistic conventions in multi-agent reinforcement learning | | 0 |
| Improving Automatic Source Code Summarization via Deep Reinforcement Learning | Code | 0 |
| Intervention Aided Reinforcement Learning for Safe and Practical Policy Optimization in Navigation | | 0 |
| Concept Learning through Deep Reinforcement Learning with Memory-Augmented Neural Networks | | 0 |
| Reward learning from human preferences and demonstrations in Atari | Code | 0 |
| Tight Bayesian Ambiguity Sets for Robust MDPs | | 0 |
| Orthogonal Policy Gradient and Autonomous Driving Application | | 0 |
| The Utility of Sparse Representations for Control in Reinforcement Learning | | 0 |
| Natural Environment Benchmarks for Reinforcement Learning | Code | 0 |
| Bayesian Reinforcement Learning in Factored POMDPs | | 0 |
| Large-scale Interactive Recommendation with Tree-structured Policy Gradient | | 0 |
| Emergence of Addictive Behaviors in Reinforcement Learning Agents | | 0 |
| Deep Q learning for fooling neural networks | Code | 0 |
| Image Captioning Based on a Hierarchical Attention Mechanism and Policy Gradient Optimization | | 0 |
| Modelling the Dynamic Joint Policy of Teammates with Attention Multi-agent DDPG | | 0 |
Page 266 of 303

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |