SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find the optimal policy, that is, the decision-making strategy that maximizes long-term reward.
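As a rough illustration of the loop described above, the sketch below runs tabular Q-learning on a toy five-state chain. Everything in it is an assumption made for the example and not taken from any paper listed here: the chain environment, the names `step`, `discounted_return`, `q_learning`, and `greedy_policy`, and the hyperparameter values. Q-learning is only one of many ways to learn a policy from reward feedback.

```python
import random

N_STATES = 5      # hypothetical toy chain: states 0..4, state 4 is terminal
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Toy environment transition: returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0  # reward only on reaching the goal state
    return next_state, reward, done


def discounted_return(rewards, gamma):
    """Cumulative reward G = r_0 + gamma*r_1 + gamma^2*r_2 + ..."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g


def q_learning(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Learn action values from interaction with the toy chain."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Uniform random behaviour policy; Q-learning is off-policy,
            # so it can still learn the values of the greedy policy.
            action = rng.choice(ACTIONS)
            next_state, reward, done = step(state, action)
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Pick the highest-valued action in each non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]
```

Calling `greedy_policy(q_learning())` yields one action per non-terminal state; in this toy chain the learned policy moves right toward the rewarding state, which is the sense in which the policy maximizes long-term reward.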

Papers

Showing 12451–12500 of 15113 papers

Title | Status | Hype
Reinforcement Learning with Competitive Ensembles of Information-Constrained Primitives |  | 0
Ranking Policy Gradient | Code | 0
Proximal Distilled Evolutionary Reinforcement Learning | Code | 0
Modern Deep Reinforcement Learning Algorithms | Code | 0
Optimal Use of Experience in First Person Shooter Environments |  | 0
Event-Driven Models |  | 0
Deceptive Reinforcement Learning Under Adversarial Manipulations on Cost Signals |  | 0
Deep Conservative Policy Iteration |  | 0
Inverse reinforcement learning conditioned on brain scan |  | 0
A Theoretical Connection Between Statistical Physics and Reinforcement Learning |  | 0
Neural networks with motivation |  | 0
Reinforcement Learning-Based Trajectory Design for the Aerial Base Stations |  | 0
A neurally plausible model learns successor representations in partially observable environments | Code | 0
Disentangled Skill Embeddings for Reinforcement Learning |  | 0
Leveraging Reinforcement Learning Techniques for Effective Policy Adoption and Validation |  | 0
Continual Reinforcement Learning with Diversity Exploration and Adversarial Self-Correction |  | 0
A Study of State Aliasing in Structured Prediction with RNNs |  | 0
Shaping Belief States with Generative Environment Models for RL |  | 0
Revised Progressive-Hedging-Algorithm Based Two-layer Solution Scheme for Bayesian Reinforcement Learning |  | 0
Variable Impedance Control in End-Effector Space: An Action Space for Reinforcement Learning in Contact-Rich Tasks |  | 0
Placeto: Learning Generalizable Device Placement Algorithms for Distributed Machine Learning | Code | 0
When Multiple Agents Learn to Schedule: A Distributed Radio Resource Management Framework |  | 0
Finding Needles in a Moving Haystack: Prioritizing Alerts with Adversarial Reinforcement Learning |  | 0
A Deep Reinforcement Learning Approach for Global Routing | Code | 0
Cache-Aided NOMA Mobile Edge Computing: A Reinforcement Learning Approach |  | 0
A Hierarchical Architecture for Sequential Decision-Making in Autonomous Driving using Deep Reinforcement Learning | Code | 0
Cooperative Lane Changing via Deep Reinforcement Learning |  | 0
Calibrated Model-Based Deep Reinforcement Learning | Code | 0
Adapting Behaviour via Intrinsic Reward: A Survey and Empirical Study |  | 0
Experience Replay Optimization |  | 0
Multi-user Resource Control with Deep Reinforcement Learning in IoT Edge Computing |  | 0
Wasserstein Adversarial Imitation Learning |  | 0
Reward Prediction Error as an Exploration Objective in Deep RL |  | 0
Robust Reinforcement Learning for Continuous Control with Model Misspecification |  | 0
Sample-efficient Adversarial Imitation Learning from Observation |  | 0
RIDM: Reinforced Inverse Dynamics Modeling for Learning from a Single Observed Demonstration |  | 0
Towards White-box Benchmarks for Algorithm Control |  | 0
Evolutionary Reinforcement Learning for Sample-Efficient Multiagent Coordination |  | 0
Gap-Increasing Policy Evaluation for Efficient and Noise-Tolerant Reinforcement Learning |  | 0
Hill Climbing on Value Estimates for Search-control in Dyna |  | 0
Directed Exploration for Reinforcement Learning |  | 0
Language as an Abstraction for Hierarchical Deep Reinforcement Learning | Code | 0
A gray-box approach for curriculum learning |  | 0
Learning-Driven Exploration for Reinforcement Learning | Code | 0
Iterative Model-Based Reinforcement Learning Using Simulations in the Differentiable Neural Computer |  | 0
A Joint Planning and Learning Framework for Human-Aided Decision-Making |  | 0
Universal Successor Features Based Deep Reinforcement Learning for Navigation |  | 0
LPaintB: Learning to Paint from Self-Supervision |  | 0
Reinforcement Learning Driven Heuristic Optimization |  | 0
Reinforcement Learning with Non-uniform State Representations for Adaptive Search |  | 0
Page 250 of 303

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 |  | Unverified
2 | PPO | Mean Normalized Performance | 0.58 |  | Unverified