SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes long-term reward.
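
The interaction loop described above can be sketched with tabular Q-learning, one of many RL algorithms and chosen here only for brevity. Everything in this sketch is an assumption made for illustration: the five-state corridor environment, the `step` and `train` helpers, and the hyperparameters (`alpha`, `gamma`, `epsilon`). None of it is taken from a listed paper.

```python
import random

N_STATES = 5          # states 0..4; state 4 is the goal (terminal)
ACTIONS = (0, 1)      # 0 = move left, 1 = move right


def step(state, action):
    """Hypothetical corridor environment: reward 1 only on reaching the goal."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=200, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behavior policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                # Explore: pick a uniformly random action.
                action = rng.choice(ACTIONS)
            else:
                # Exploit: pick a best-valued action, breaking ties at random.
                best = max(q[state])
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # Bootstrapped target: reward plus discounted value of the next state.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


q_table = train()
# Greedy policy for the non-terminal states: the highest-valued action in each.
policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
print(policy)
```

In this toy setting the learned greedy policy should choose "right" (action 1) in every non-terminal state, since that is the only way to collect the reward. The discount factor `gamma` is what makes earlier states value the delayed reward less than later ones.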

Papers

Showing 8701-8750 of 15113 papers

Title | Status | Hype
Reinforced Hybrid Genetic Algorithm for the Traveling Salesman Problem | - | 0
Offline reinforcement learning with uncertainty for treatment strategies in sepsis | - | 0
Inferring Probabilistic Reward Machines from Non-Markovian Reward Processes for Reinforcement Learning | - | 0
Attend2Pack: Bin Packing through Deep Reinforcement Learning with Attention | - | 0
Learning Interaction-aware Guidance Policies for Motion Planning in Dense Traffic Scenarios | - | 0
Aligning an optical interferometer with beam divergence control and continuous action space | Code | 0
Policy Gradient Methods for Distortion Risk Measures | - | 0
CLAIM: Curriculum Learning Policy for Influence Maximization in Unknown Social Networks | - | 0
Adaptive Stress Testing for Adversarial Learning in a Financial Environment | - | 0
Automated Gain Control Through Deep Reinforcement Learning for Downstream Radar Object Detection | - | 0
Efficient Model-Based Multi-Agent Mean-Field Reinforcement Learning | - | 0
Enhancing Video Analytics Accuracy via Real-time Automated Camera Parameter Tuning | - | 0
Adaptation of Quadruped Robot Locomotion with Meta-Learning | - | 0
Computational Benefits of Intermediate Rewards for Goal-Reaching Policy Learning | Code | 0
Sublinear Regret for Learning POMDPs | - | 0
Towards Autonomous Pipeline Inspection with Hierarchical Reinforcement Learning | - | 0
Quadruped Locomotion on Non-Rigid Terrain using Reinforcement Learning | - | 0
Pseudo-Model-Free Hedging for Variable Annuities via Deep Reinforcement Learning | - | 0
Federated Model Search via Reinforcement Learning | - | 0
Learning Time-Invariant Reward Functions through Model-Based Inverse Reinforcement Learning | - | 0
DORA: Toward Policy Optimization for Task-oriented Dialogue System with Efficient Context | Code | 0
A Unified Off-Policy Evaluation Approach for General Value Function | - | 0
A Short Note on the Relationship of Information Gain and Eluder Dimension | - | 0
Meta-Reinforcement Learning for Heuristic Planning | - | 0
The Least Restriction for Offline Reinforcement Learning | - | 0
Winning at Any Cost -- Infringing the Cartel Prohibition With Reinforcement Learning | - | 0
Gradient Importance Learning for Incomplete Observations | Code | 0
A Review of Explainable Artificial Intelligence in Manufacturing | - | 0
Ensemble and Auxiliary Tasks for Data-Efficient Deep Reinforcement Learning | Code | 0
Control of rough terrain vehicles using deep reinforcement learning | - | 0
Low Dimensional State Representation Learning with Robotics Priors in Continuous Action Spaces | - | 0
Low-Dimensional State and Action Representation Learning with MDP Homomorphism Metrics | - | 0
Restless and Uncertain: Robust Policies for Restless Bandits via Deep Multi-Agent Reinforcement Learning | - | 0
Traffic Signal Control with Communicative Deep Reinforcement Learning Agents: a Case Study | - | 0
Optimality Inductive Biases and Agnostic Guidelines for Offline Reinforcement Learning | Code | 0
Examining average and discounted reward optimality criteria in reinforcement learning | - | 0
Controlled Interacting Particle Algorithms for Simulation-based Reinforcement Learning | Code | 0
A Novel Deep Reinforcement Learning Based Stock Direction Prediction using Knowledge Graph and Community Aware Sentiments | - | 0
Beyond Value-Function Gaps: Improved Instance-Dependent Regret Bounds for Episodic Reinforcement Learning | - | 0
Reinforcement Learning for Feedback-Enabled Cyber Resilience | - | 0
RL-NCS: Reinforcement learning based data-driven approach for nonuniform compressed sensing | Code | 0
SocialAI: Benchmarking Socio-Cognitive Abilities in Deep Reinforcement Learning Agents | - | 0
Optimal Power Allocation for Rate Splitting Communications with Deep Reinforcement Learning | - | 0
MHER: Model-based Hindsight Experience Replay | - | 0
Model Mediated Teleoperation with a Hand-Arm Exoskeleton in Long Time Delays Using Reinforcement Learning | - | 0
Blending Task Success and User Satisfaction: Analysis of Learned Dialogue Behaviour with Multiple Rewards | - | 0
Goal-Conditioned Reinforcement Learning with Imagined Subgoals | - | 0
Inverse Reinforcement Learning Based Stochastic Driver Behavior Learning | - | 0
Decomposing the Prediction Problem; Autonomous Navigation by neoRL Agents | - | 0
Adaptive Stochastic ADMM for Decentralized Reinforcement Learning in Edge Industrial IoT | - | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | - | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | - | Unverified