SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback, which arrives as rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes long-term reward.
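The interaction loop described above can be sketched with tabular Q-learning on a toy problem. This is only an illustrative sketch, not a method taken from any paper listed here: the five-cell corridor environment, the `step` and `train` helpers, and all hyperparameters (`alpha`, `gamma`, `epsilon`, the episode count) are assumptions chosen for brevity.

```python
import random

N_STATES = 5          # corridor cells 0..4; cell 4 is the rewarding terminal state
ACTIONS = (0, 1)      # 0 = move left, 1 = move right

def step(state, action):
    """Hypothetical toy environment: returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done

def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration (assumed settings)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _episode in range(episodes):
        state = 0
        for _t in range(200):  # step cap so an episode can never run forever
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)  # explore
            else:
                best = max(q[state])          # exploit, breaking ties at random
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # One-step target: immediate reward plus discounted best future value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q

def greedy_policy(q):
    """Best-known action for each non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]

q_table = train()
policy = greedy_policy(q_table)
print(policy)
```

In this sketch the reward is only paid on reaching the last cell, so the learned policy has to trade immediate feedback (always zero before the goal) against discounted long-term reward, which is the sense of "cumulative reward" used in the description.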

Papers

Showing 3701–3750 of 15113 papers

Title | Status | Hype
Q-Cogni: An Integrated Causal Reinforcement Learning Framework | - | 0
Exponential Hardness of Reinforcement Learning with Linear Function Approximation | - | 0
Limited Query Graph Connectivity Test | - | 0
A Human-Centered Safe Robot Reinforcement Learning Framework with Interactive Behaviors | - | 0
On Bellman's principle of optimality and Reinforcement learning for safety-constrained Markov decision process | - | 0
Finding Regularized Competitive Equilibria of Heterogeneous Agent Macroeconomic Models with Reinforcement Learning | - | 0
The Dormant Neuron Phenomenon in Deep Reinforcement Learning | Code | 6
GANterfactual-RL: Understanding Reinforcement Learning Agents' Strategies through Visual Counterfactual Explanations | Code | 1
Neural Laplace Control for Continuous-time Delayed Systems | Code | 1
EvoTorch: Scalable Evolutionary Computation in Python | Code | 3
Model-Based Uncertainty in Value Functions | Code | 1
GraphSR: A Data Augmentation Algorithm for Imbalanced Node Classification | - | 0
Leveraging Jumpy Models for Planning and Fast Learning in Robotic Domains | - | 0
Multi-Agent Reinforcement Learning with Common Policy for Antenna Tilt Optimization | - | 0
VIPeR: Provably Efficient Algorithm for Offline RL with Neural Function Approximation | Code | 0
AC2C: Adaptively Controlled Two-Hop Communication for Multi-Agent Reinforcement Learning | - | 0
Logarithmic Switching Cost in Reinforcement Learning beyond Linear MDPs | - | 0
Energy Harvesting Reconfigurable Intelligent Surface for UAV Based on Robust Deep Reinforcement Learning | Code | 1
To the Noise and Back: Diffusion for Shared Autonomy | - | 0
Diverse Policy Optimization for Structured Action Space | Code | 1
Concept Learning for Interpretable Multi-Agent Reinforcement Learning | - | 0
Reinforcement Learning for Combining Search Methods in the Calibration of Economic ABMs | Code | 1
Behavior Proximal Policy Optimization | Code | 1
Towards Decentralized Predictive Quality of Service in Next-Generation Vehicular Networks | - | 0
Provably Efficient Reinforcement Learning via Surprise Bound | - | 0
Constrained Reinforcement Learning using Distributional Representation for Trustworthy Quadrotor UAV Tracking Control | Code | 0
Self-supervised network distillation: an effective approach to exploration in sparse reward environments | Code | 0
BadGPT: Exploring Security Vulnerabilities of ChatGPT via Backdoor Attacks to InstructGPT | - | 0
Reinforcement Learning for Block Decomposition of CAD Models | - | 0
Assessment of Reinforcement Learning for Macro Placement | Code | 2
A Reinforcement Learning Framework for Online Speaker Diarization | - | 0
Adversarial Model for Offline Reinforcement Learning | - | 0
Robust Auto-landing Control of an agile Regional Jet Using Fuzzy Q-learning | - | 0
Offline Reinforcement Learning for Mixture-of-Expert Dialogue Management | - | 0
MAC-PO: Multi-Agent Experience Replay via Collective Priority Optimization | Code | 0
Deep Reinforcement Learning for Robotic Pushing and Picking in Cluttered Environment | - | 0
Towards a Sustainable Internet-of-Underwater-Things based on AUVs, SWIPT, and Reinforcement Learning | - | 0
Learning to Play Text-based Adventure Games with Maximum Entropy Reinforcement Learning | Code | 0
Kernel-Based Distributed Q-Learning: A Scalable Reinforcement Learning Approach for Dynamic Treatment Regimes | - | 0
Curiosity-driven Exploration in Sparse-reward Multi-agent Reinforcement Learning | - | 0
Minimax-Bayes Reinforcement Learning | Code | 0
Constrained Reinforcement Learning for Predictive Control in Real-Time Stochastic Dynamic Optimal Power Flow | - | 0
Handling Long and Richly Constrained Tasks through Constrained Hierarchical Reinforcement Learning | - | 0
Reinforcement Learning in a Birth and Death Process: Breaking the Dependence on the State Space | - | 0
Reinforcement Learning-based Control of Nonlinear Systems using Carleman Approximation: Structured and Unstructured Designs | - | 0
UAV Path Planning Employing MPC-Reinforcement Learning Method Considering Collision Avoidance | - | 0
Provably Efficient Exploration in Quantum Reinforcement Learning with Logarithmic Worst-Case Regret | - | 0
Multiagent Inverse Reinforcement Learning via Theory of Mind Reasoning | Code | 0
Deep Reinforcement Learning for Cost-Effective Medical Diagnosis | Code | 1
Fantastic Rewards and How to Tame Them: A Case Study on Reward Learning for Task-oriented Dialogue Systems | Code | 0
Page 75 of 303

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | - | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | - | Unverified