SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes the expected long-term reward.
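
The interaction loop described above can be illustrated with a minimal sketch. It uses tabular Q-learning, which is only one of many RL methods, on a hypothetical five-state chain. The environment, the function names (step, train, greedy_policy), and all hyperparameter values are assumptions made for this illustration and are not taken from any paper listed on this page.

```python
import random

N_STATES = 5      # hypothetical chain: states 0..4, state 4 is the goal
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Toy environment (an assumption): reward 1.0 only when the goal is reached."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                # Explore: pick a random action.
                action = rng.choice(ACTIONS)
            else:
                # Exploit: pick a best-valued action, breaking ties randomly.
                best = max(q[state])
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # Bootstrapped target: immediate reward plus discounted future value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Best action in each non-terminal state under the learned values."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

In this toy setting the discount factor gamma is what encodes "long-term" reward: moving right reaches the goal sooner, so its discounted value is higher and the greedy policy should end up preferring it in every state.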

Papers

Showing 11801–11850 of 15113 papers

Title | Status | Hype
Learning from Trajectories via Subgoal Discovery | Code | 0
Thompson Sampling for Contextual Bandit Problems with Auxiliary Safety Constraints |  | 0
On Solving the 2-Dimensional Greedy Shooter Problem for UAVs | Code | 0
Neural Topic Model with Reinforcement Learning |  | 0
Situated GAIL: Multitask imitation using task-conditioned adversarial inverse reinforcement learning |  | 0
Positive-Unlabeled Reward Learning |  | 0
Incorporating Graph Attention Mechanism into Knowledge Graph Reasoning Based on Deep Reinforcement Learning |  | 0
DIVINE: A Generative Adversarial Imitation Learning Framework for Knowledge Graph Reasoning |  | 0
Generalized Speedy Q-learning | Code | 0
Generating Formality-Tuned Summaries Using Input-Dependent Rewards |  | 0
Exploring Diverse Expressions for Paraphrase Generation |  | 0
Frequentist Regret Bounds for Randomized Least-Squares Value Iteration | Code | 0
A2: Extracting Cyclic Switchings from DOB-nets for Rejecting Excessive Disturbances |  | 0
Learning the Extraction Order of Multiple Relational Facts in a Sentence with Reinforcement Learning |  | 0
Deep Reinforcement Learning-based Text Anonymization against Private-Attribute Inference |  | 0
Explicit Explore-Exploit Algorithms in Continuous State Spaces | Code | 0
Answer-Supervised Question Reformulation for Enhancing Conversational Machine Comprehension |  | 0
Cascaded LSTMs based Deep Reinforcement Learning for Goal-driven Dialogue | Code | 0
DeepLine: AutoML Tool for Pipelines Generation using Deep Reinforcement Learning and Hierarchical Actions Filtering |  | 0
Hierarchical Expert Networks for Meta-Learning |  | 0
VASE: Variational Assorted Surprise Exploration for Reinforcement Learning |  | 0
RLINK: Deep Reinforcement Learning for User Identity Linkage |  | 0
RBED: Reward Based Epsilon Decay |  | 0
Policy Continuation with Hindsight Inverse Dynamics | Code | 0
Learning Algorithmic Solutions to Symbolic Planning Tasks with a Neural Computer Architecture |  | 0
A Distributed Model-Free Algorithm for Multi-hop Ride-sharing using Deep Reinforcement Learning |  | 0
DADI: Dynamic Discovery of Fair Information with Adversarial Reinforcement Learning |  | 0
Deep Reinforcement Learning for Distributed Uncoordinated Cognitive Radios Resource Allocation |  | 0
Adaptive Sampling Quasi-Newton Methods for Derivative-Free Stochastic Optimization |  | 0
Deep reinforcement learning for market making in corporate bonds: beating the curse of dimensionality |  | 0
Deep Decentralized Reinforcement Learning for Cooperative Control |  | 0
Overcoming Catastrophic Interference in Online Reinforcement Learning with Dynamic Self-Organizing Maps |  | 0
Navigation Agents for the Visually Impaired: A Sidewalk Simulator and Experiments | Code | 0
Feedback Linearization for Unknown Systems via Reinforcement Learning |  | 0
Constrained Reinforcement Learning Has Zero Duality Gap |  | 0
Robust Model-free Reinforcement Learning with Multi-objective Bayesian Optimization |  | 0
Biomimetic Ultra-Broadband Perfect Absorbers Optimised with Reinforcement Learning |  | 0
Asynchronous Methods for Model-Based Reinforcement Learning | Code | 0
Certified Adversarial Robustness for Deep Reinforcement Learning |  | 0
Generalization in Reinforcement Learning with Selective Noise Injection and Information Bottleneck | Code | 0
Quantum enhancements for deep reinforcement learning in large spaces | Code | 0
Entity Abstraction in Visual Model-Based Reinforcement Learning | Code | 0
Model-Free Mean-Field Reinforcement Learning: Mean-Field MDP and Mean-Field Q-Learning |  | 0
Neural Architecture Evolution in Deep Reinforcement Learning for Continuous Control |  | 0
Minimax Weight and Q-Function Learning for Off-Policy Evaluation |  | 0
Task-Oriented Language Grounding for Language Input with Multiple Sub-Goals of Non-Linear Order | Code | 0
BAIL: Best-Action Imitation Learning for Batch Deep Reinforcement Learning | Code | 0
Convergent Policy Optimization for Safe Reinforcement Learning | Code | 0
Reinforcement Learning-Enabled Reliable Wireless Sensor Networks in Dynamic Underground Environments |  | 0
ZPD Teaching Strategies for Deep Reinforcement Learning from Demonstrations | Code | 0
Page 237 of 303

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 |  | Unverified
2 | PPO | Mean Normalized Performance | 0.58 |  | Unverified