SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes long-term reward.
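The interaction loop described above can be sketched in a few lines. This is a minimal illustration only: the five-state chain environment, the `step`, `train`, and `greedy_policy` helpers, and the hyperparameter values are all hypothetical, and tabular Q-learning is used as one possible learning rule rather than anything this page prescribes.

```python
import random

N_STATES = 5      # hypothetical toy task: states 0..4, state 4 is terminal
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Toy environment: reward 1.0 only when the last state is reached."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)  # explore
            else:
                # exploit, breaking ties at random
                best = max(q[state])
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # bootstrap from the best next action unless the episode ended
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Best action in each non-terminal state according to the Q-table."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

In this toy setting the learned greedy policy moves right in every state, since that is the only way to collect the reward; the discount factor `gamma` is what makes shorter paths to the reward more valuable than longer ones.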

Papers

Showing 5301-5350 of 15113 papers

| Title | Status | Hype |
|---|---|---|
| Reinforcement Learning in a Birth and Death Process: Breaking the Dependence on the State Space | | 0 |
| Offline Reinforcement Learning for Mixture-of-Expert Dialogue Management | | 0 |
| Reinforcement Learning-based Control of Nonlinear Systems using Carleman Approximation: Structured and Unstructured Designs | | 0 |
| MAC-PO: Multi-Agent Experience Replay via Collective Priority Optimization | Code | 0 |
| Robust Auto-landing Control of an agile Regional Jet Using Fuzzy Q-learning | | 0 |
| Towards a Sustainable Internet-of-Underwater-Things based on AUVs, SWIPT, and Reinforcement Learning | | 0 |
| BadGPT: Exploring Security Vulnerabilities of ChatGPT via Backdoor Attacks to InstructGPT | | 0 |
| Handling Long and Richly Constrained Tasks through Constrained Hierarchical Reinforcement Learning | | 0 |
| Kernel-Based Distributed Q-Learning: A Scalable Reinforcement Learning Approach for Dynamic Treatment Regimes | | 0 |
| Constrained Reinforcement Learning for Predictive Control in Real-Time Stochastic Dynamic Optimal Power Flow | | 0 |
| Curiosity-driven Exploration in Sparse-reward Multi-agent Reinforcement Learning | | 0 |
| Adversarial Model for Offline Reinforcement Learning | | 0 |
| Deep Reinforcement Learning for Robotic Pushing and Picking in Cluttered Environment | | 0 |
| A Reinforcement Learning Framework for Online Speaker Diarization | | 0 |
| Backstepping Temporal Difference Learning | | 0 |
| Fantastic Rewards and How to Tame Them: A Case Study on Reward Learning for Task-oriented Dialogue Systems | Code | 0 |
| Differentiable Arbitrating in Zero-sum Markov Games | | 0 |
| DC4L: Distribution Shift Recovery via Data-Driven Control for Deep Learning Models | Code | 0 |
| Reinforcement Learning with Function Approximation: From Linear to Nonlinear | | 0 |
| Safe Deep Reinforcement Learning by Verifying Task-Level Properties | | 0 |
| Multiagent Inverse Reinforcement Learning via Theory of Mind Reasoning | Code | 0 |
| Robust and Versatile Bipedal Jumping Control through Reinforcement Learning | | 0 |
| Generalization in Visual Reinforcement Learning with the Reward Sequence Distribution | Code | 0 |
| Compositionality and Bounds for Optimal Value Functions in Reinforcement Learning | | 0 |
| Interactive Video Corpus Moment Retrieval using Reinforcement Learning | | 0 |
| AutoDOViz: Human-Centered Automation for Decision Optimization | | 0 |
| Auto.gov: Learning-based Governance for Decentralized Finance (DeFi) | Code | 0 |
| HOPE: Human-Centric Off-Policy Evaluation for E-Learning and Healthcare | | 0 |
| Effective Multimodal Reinforcement Learning with Modality Alignment and Importance Enhancement | | 0 |
| Efficient Exploration via Epistemic-Risk-Seeking Policy Optimization | | 0 |
| Reinforcement Learning in the Wild with Maximum Likelihood-based Model Transfer | | 0 |
| Promoting Cooperation in Multi-Agent Reinforcement Learning via Mutual Help | | 0 |
| Natural Language-conditioned Reinforcement Learning with Inside-out Task Language Development and Translation | | 0 |
| Post Reinforcement Learning Inference | Code | 0 |
| Robot path planning using deep reinforcement learning | | 0 |
| Deep Reinforcement Learning for mmWave Initial Beam Alignment | | 0 |
| A State Augmentation based approach to Reinforcement Learning from Human Preferences | | 0 |
| Mixed Traffic Control and Coordination from Pixels | | 0 |
| Exploiting Unlabeled Data for Feedback Efficient Human Preference based Reinforcement Learning | | 0 |
| Learning to Forecast Aleatoric and Epistemic Uncertainties over Long Horizon Trajectories | | 0 |
| Data Driven Reward Initialization for Preference based Reinforcement Learning | | 0 |
| Tuning computer vision models with task rewards | | 0 |
| Quantum Computing Provides Exponential Regret Improvement in Episodic Reinforcement Learning | | 0 |
| Optimal Sample Complexity of Reinforcement Learning for Mixing Discounted Markov Decision Processes | | 0 |
| Meta-Reinforcement Learning via Exploratory Task Clustering | | 0 |
| Prioritized offline Goal-swapping Experience Replay | | 0 |
| Reinforcement Learning Based Power Grid Day-Ahead Planning and AI-Assisted Control | | 0 |
| Scalable Multi-Agent Reinforcement Learning with General Utilities | | 0 |
| CERiL: Continuous Event-based Reinforcement Learning | | 0 |
| Deep Offline Reinforcement Learning for Real-world Treatment Optimization Applications | | 0 |
Page 107 of 303

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |