SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes long-term reward.
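
The interaction loop described above can be sketched in a few lines of Python. This is a minimal illustration only: the five-cell corridor environment, the `step`, `train`, and `greedy_policy` helpers, and all hyperparameter values are hypothetical, and tabular Q-learning is just one of many RL algorithms, chosen here for brevity.

```python
import random

N_STATES = 5      # hypothetical corridor with cells 0..4; cell 4 is the goal
ACTIONS = (0, 1)  # 0 = step left, 1 = step right


def step(state, action):
    """Toy environment (an assumption): return (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)  # explore
            else:
                # Exploit, breaking ties at random so early episodes still move.
                best = max(q[state])
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # Bootstrapped target: immediate reward plus discounted future value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Best action in each non-terminal state under the learned values."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

In this toy setting the only reward sits at the right end of the corridor, so the learned greedy policy should step right in every non-terminal cell. The discount factor `gamma` is what turns a single delayed reward into a preference for shorter paths.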

Papers

Showing 12301-12350 of 15113 papers

| Title | Status | Hype |
| --- | --- | --- |
| Proximal Distilled Evolutionary Reinforcement Learning | Code | 0 |
| Optimal Use of Experience in First Person Shooter Environments | | 0 |
| Modern Deep Reinforcement Learning Algorithms | Code | 0 |
| Ranking Policy Gradient | Code | 0 |
| Neural networks with motivation | | 0 |
| Reinforcement Learning-Based Trajectory Design for the Aerial Base Stations | | 0 |
| A neurally plausible model learns successor representations in partially observable environments | Code | 0 |
| A Story of Two Streams: Reinforcement Learning Models from Human Behavior and Neuropsychiatry | Code | 1 |
| Split Q Learning: Reinforcement Learning with Two-Stream Rewards | Code | 1 |
| Revised Progressive-Hedging-Algorithm Based Two-layer Solution Scheme for Bayesian Reinforcement Learning | | 0 |
| Reinforcement Learning with Convex Constraints | Code | 1 |
| Shaping Belief States with Generative Environment Models for RL | | 0 |
| A Study of State Aliasing in Structured Prediction with RNNs | | 0 |
| Disentangled Skill Embeddings for Reinforcement Learning | | 0 |
| Leveraging Reinforcement Learning Techniques for Effective Policy Adoption and Validation | | 0 |
| Continual Reinforcement Learning with Diversity Exploration and Adversarial Self-Correction | | 0 |
| Cache-Aided NOMA Mobile Edge Computing: A Reinforcement Learning Approach | | 0 |
| A Hierarchical Architecture for Sequential Decision-Making in Autonomous Driving using Deep Reinforcement Learning | Code | 0 |
| Finding Needles in a Moving Haystack: Prioritizing Alerts with Adversarial Reinforcement Learning | | 0 |
| Cooperative Lane Changing via Deep Reinforcement Learning | | 0 |
| A Deep Reinforcement Learning Approach for Global Routing | Code | 0 |
| Placeto: Learning Generalizable Device Placement Algorithms for Distributed Machine Learning | Code | 0 |
| When Multiple Agents Learn to Schedule: A Distributed Radio Resource Management Framework | | 0 |
| Variable Impedance Control in End-Effector Space: An Action Space for Reinforcement Learning in Contact-Rich Tasks | | 0 |
| Unsupervised Learning of Object Keypoints for Perception and Control | Code | 1 |
| Experience Replay Optimization | | 0 |
| Calibrated Model-Based Deep Reinforcement Learning | Code | 0 |
| Adapting Behaviour via Intrinsic Reward: A Survey and Empirical Study | | 0 |
| Wasserstein Adversarial Imitation Learning | | 0 |
| When to Trust Your Model: Model-Based Policy Optimization | Code | 1 |
| Reward Prediction Error as an Exploration Objective in Deep RL | | 0 |
| Multi-user Resource Control with Deep Reinforcement Learning in IoT Edge Computing | | 0 |
| Directed Exploration for Reinforcement Learning | | 0 |
| Hill Climbing on Value Estimates for Search-control in Dyna | | 0 |
| Language as an Abstraction for Hierarchical Deep Reinforcement Learning | Code | 0 |
| Gap-Increasing Policy Evaluation for Efficient and Noise-Tolerant Reinforcement Learning | | 0 |
| Evolutionary Reinforcement Learning for Sample-Efficient Multiagent Coordination | | 0 |
| Robust Reinforcement Learning for Continuous Control with Model Misspecification | | 0 |
| Towards White-box Benchmarks for Algorithm Control | | 0 |
| Sample-efficient Adversarial Imitation Learning from Observation | | 0 |
| RIDM: Reinforced Inverse Dynamics Modeling for Learning from a Single Observed Demonstration | | 0 |
| Universal Successor Features Based Deep Reinforcement Learning for Navigation | | 0 |
| A Joint Planning and Learning Framework for Human-Aided Decision-Making | | 0 |
| LPaintB: Learning to Paint from Self-Supervision | | 0 |
| Iterative Model-Based Reinforcement Learning Using Simulations in the Differentiable Neural Computer | | 0 |
| Learning-Driven Exploration for Reinforcement Learning | Code | 0 |
| A gray-box approach for curriculum learning | | 0 |
| MoËT: Mixture of Expert Trees and its Application to Verifiable Reinforcement Learning | Code | 1 |
| Reinforcement Learning Driven Heuristic Optimization | | 0 |
| Reinforcement Learning with Non-uniform State Representations for Adaptive Search | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |