SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes long-term reward.
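As a rough illustration of the agent-environment loop described above, the sketch below runs tabular Q-learning on a toy corridor. The environment, the choice of Q-learning, and every name and hyperparameter here (`step`, `train`, `greedy_policy`, `alpha`, `gamma`, `epsilon`) are assumptions made for this example, not something taken from the listed papers.

```python
import random

# Hypothetical toy environment (an assumption for illustration only):
# a 5-state corridor. The agent starts in state 0; reaching state 4
# yields reward 1 and ends the episode.
N_STATES = 5
ACTIONS = (-1, +1)  # move left, move right


def step(state, action):
    """Return (next_state, reward, done) for the toy corridor."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon, otherwise act greedily
            # (ties are broken in favor of moving right).
            if rng.random() < epsilon:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, ACTIONS[a])
            # Move Q(s, a) toward the reward plus the discounted best next value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """Pick the highest-valued action in each non-terminal state."""
    return [ACTIONS[0 if row[0] > row[1] else 1] for row in q[:-1]]
```

Under these assumptions, `greedy_policy(train())` should select the rightward move in every non-terminal state, since that is the only way to reach the rewarded state. The discount factor `gamma` is what makes the learned values reflect long-term rather than immediate reward.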

Papers

Showing 8951–9000 of 15113 papers

Title | Status | Hype
Multi-Objective SPIBB: Seldonian Offline Policy Improvement with Safety Constraints in Finite MDPs | | 0
Tesseract: Tensorised Actors for Multi-Agent Reinforcement Learning | | 0
Deep Reinforcement Learning in Quantitative Algorithmic Trading: A Review | Code | 0
AppBuddy: Learning to Accomplish Tasks in Mobile Apps via Reinforcement Learning | | 0
Reducing the Deployment-Time Inference Control Costs of Deep Reinforcement Learning Agents via an Asymmetric Architecture | | 0
Shaped Policy Search for Evolutionary Strategies using Waypoints | | 0
On the Theory of Reinforcement Learning with Once-per-Episode Feedback | | 0
Predictive Representation Learning for Language Modeling | | 0
Gradient-Free Neural Network Training via Synaptic-Level Reinforcement Learning | | 0
A Survey of Deep Reinforcement Learning Algorithms for Motion Planning and Control of Autonomous Vehicles | | 0
Joint Optimization of Multi-Objective Reinforcement Learning with Policy Gradient Based Algorithm | | 0
Reconfigurable Intelligent Surface-assisted Multi-UAV Networks: Efficient Resource Allocation with Deep Reinforcement Learning | | 0
Learning Approximate and Exact Numeral Systems via Reinforcement Learning | | 0
A nearly Blackwell-optimal policy gradient method | Code | 0
Sample-Efficient Reinforcement Learning for Linearly-Parameterized MDPs with a Generative Model | | 0
Risk-Aware Transfer in Reinforcement Learning using Successor Features | | 0
Task-Guided Inverse Reinforcement Learning Under Partial Information | | 0
Reinforcement Learning reveals fundamental limits on the mixing of active particles | | 0
Transferable Deep Reinforcement Learning Framework for Autonomous Vehicles with Joint Radar-Data Communications | | 0
Stochastic Intervention for Causal Inference via Reinforcement Learning | | 0
Reinforcement Learning for on-line Sequence Transformation | | 0
Optimistic Reinforcement Learning by Forward Kullback-Leibler Divergence Optimization | | 0
Pattern Transfer Learning for Reinforcement Learning in Order Dispatching | | 0
Branching Dueling Q-Network Based Online Scheduling of a Microgrid With Distributed Energy Storage Systems | | 0
Adversarial Intrinsic Motivation for Reinforcement Learning | Code | 0
A Modular and Transferable Reinforcement Learning Framework for the Fleet Rebalancing Problem | | 0
Context-aware taxi dispatching at city-scale using deep reinforcement learning | | 0
Successive Convex Approximation Based Off-Policy Optimization for Constrained Reinforcement Learning | Code | 0
Safe Model-based Off-policy Reinforcement Learning for Eco-Driving in Connected and Automated Hybrid Electric Vehicles | | 0
Transfer Learning and Curriculum Learning in Sokoban | | 0
Unbiased Asymmetric Reinforcement Learning under Partial Observability | | 0
Trajectory Modeling via Random Utility Inverse Reinforcement Learning | | 0
Towards Scalable Verification of Deep Reinforcement Learning | Code | 0
KnowSR: Knowledge Sharing among Homogeneous Agents in Multi-agent Reinforcement Learning | | 0
A Generalised Inverse Reinforcement Learning Framework | | 0
Bayesian Nonparametric Reinforcement Learning in LTE and Wi-Fi Coexistence | | 0
A Comparison of Reward Functions in Q-Learning Applied to a Cart Position Problem | Code | 0
Interpretable UAV Collision Avoidance using Deep Reinforcement Learning | | 0
FNAS: Uncertainty-Aware Fast Neural Architecture Search | | 0
IGO-QNN: Quantum Neural Network Architecture for Inductive Grover Oracularization | | 0
Verification of Dissipativity and Evaluation of Storage Function in Economic Nonlinear MPC using Q-Learning | | 0
Policy Mirror Descent for Regularized Reinforcement Learning: A Generalized Framework with Linear Convergence | | 0
Room Clearance with Feudal Hierarchical Reinforcement Learning | | 0
An Efficient Application of Neuroevolution for Competitive Multiagent Learning | Code | 0
Attention-based Reinforcement Learning for Real-Time UAV Semantic Communication | | 0
Learning MDPs from Features: Predict-Then-Optimize for Sequential Decision Making by Reinforcement Learning | | 0
An Exponential Lower Bound for Linearly Realizable MDP with Constant Suboptimality Gap | | 0
Certification of Iterative Predictions in Bayesian Neural Networks | Code | 0
De-Biased Modelling of Search Click Behavior with Reinforcement Learning | | 0
Rule Augmented Unsupervised Constituency Parsing | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | | Unverified