SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy: a decision-making strategy that maximizes long-term reward.
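The interaction loop described above can be illustrated with a minimal sketch of tabular Q-learning. Everything here is an assumption made for illustration and is not taken from any paper listed on this page: the five-state corridor environment, the `step` and `train` helpers, the reward of 1.0 at the goal, the purely random exploration, and the values of `GAMMA` and `ALPHA`.

```python
import random

N_STATES = 5          # hypothetical corridor: states 0..4, state 4 is the goal
ACTIONS = (0, 1)      # 0 = step left, 1 = step right
GAMMA = 0.9           # discount factor applied to future rewards (assumed value)
ALPHA = 0.5           # learning rate (assumed value)


def step(state, action):
    """Toy environment: return (next_state, reward, done)."""
    next_state = max(state - 1, 0) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def train(episodes=500, max_steps=200, seed=0):
    """Learn a Q-table from random interaction with the toy environment."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _episode in range(episodes):
        state = 0
        for _t in range(max_steps):
            # Purely random exploration; Q-learning is off-policy, so the
            # greedy policy can still be read off the table afterwards.
            action = rng.choice(ACTIONS)
            next_state, reward, done = step(state, action)
            # Move the estimate toward reward + discounted best next value.
            target = reward if done else reward + GAMMA * max(q[next_state])
            q[state][action] += ALPHA * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


q_table = train()
# Greedy policy for the non-terminal states 0..3.
greedy_policy = [max(ACTIONS, key=lambda a: q_table[s][a])
                 for s in range(N_STATES - 1)]
print(greedy_policy)
```

In this toy setting the learned greedy policy should choose "right" (action 1) in every non-terminal state, since that is the shortest route to the only reward. Practical RL systems typically replace the table with a function approximator and use a more careful exploration strategy.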

Papers

Showing 11101–11150 of 15113 papers

Title | Status | Hype
Extended Markov Games to Learn Multiple Tasks in Multi-Agent Reinforcement Learning | Code | 0
An Inductive Bias for Distances: Neural Nets that Respect the Triangle Inequality | Code | 1
Resource Management in Wireless Networks via Multi-Agent Deep Reinforcement Learning | - | 0
Robust Reinforcement Learning via Adversarial training with Langevin Dynamics | Code | 0
Hoplite: Efficient and Fault-Tolerant Collective Communication for Task-Based Distributed Systems | Code | 1
Deep Reinforcement Learning-Based Beam Tracking for Low-Latency Services in Vehicular Networks | - | 0
Fast Reinforcement Learning for Anti-jamming Communications | - | 0
Improving Generalization of Reinforcement Learning with Minimax Distributional Soft Actor-Critic | - | 0
Effective Reinforcement Learning through Evolutionary Surrogate-Assisted Prescription | Code | 1
MODRL/D-AM: Multiobjective Deep Reinforcement Learning Algorithm Using Decomposition and Attention Model for Multiobjective Optimization | - | 0
Multi-Vehicle Routing Problems with Soft Time Windows: A Multi-Agent Reinforcement Learning Approach | - | 0
A Tensor Network Approach to Finite Markov Decision Processes | - | 0
Data Efficient Training for Reinforcement Learning with Adaptive Behavior Policy Sharing | - | 0
Regret Bounds for Discounted MDPs | - | 0
On the Convergence Theory of Debiased Model-Agnostic Meta-Reinforcement Learning | Code | 0
Towards Intelligent Pick and Place Assembly of Individualized Products Using Reinforcement Learning | - | 0
Objective Mismatch in Model-based Reinforcement Learning | Code | 1
Machine Learning Approaches For Motor Learning: A Short Review | - | 0
Reinforcement Learning Enhanced Quantum-inspired Algorithm for Combinatorial Optimization | Code | 1
Learning to Switch Among Agents in a Team via 2-Layer Markov Decision Processes | - | 0
HMRL: Hyper-Meta Learning for Sparse Reward Reinforcement Learning Problem | - | 0
Learning Structured Communication for Multi-agent Reinforcement Learning | - | 0
Confounding-Robust Policy Evaluation in Infinite-Horizon Reinforcement Learning | - | 0
On Reward Shaping for Mobile Robot Navigation: A Reinforcement Learning and SLAM Based Approach | - | 0
Proficiency Constrained Multi-Agent Reinforcement Learning for Environment-Adaptive Multi UAV-UGV Teaming | - | 0
Provable Self-Play Algorithms for Competitive Reinforcement Learning | - | 0
On the Convergence of the Monte Carlo Exploring Starts Algorithm for Reinforcement Learning | - | 0
SparseIDS: Learning Packet Sampling with Reinforcement Learning | Code | 1
Interpretable Off-Policy Evaluation in Reinforcement Learning by Highlighting Influential Transitions | - | 0
Discrete Action On-Policy Learning with Action-Value Critic | Code | 0
Reinforcement-Learning based Portfolio Management with Augmented Asset Movement Prediction States | Code | 1
Reward Tweaking: Maximizing the Total Reward While Planning for Short Horizons | - | 0
A Deep Reinforcement Learning Algorithm Using Dynamic Attention Model for Vehicle Routing Problems | Code | 1
Analyzing Policy Distillation on Multi-Task Learning and Meta-Reinforcement Learning in Meta-World | - | 0
Learning State Abstractions for Transfer in Continuous Control | Code | 0
BRPO: Batch Residual Policy Optimization | - | 0
Conservative Exploration in Reinforcement Learning | - | 0
A data-driven choice of misfit function for FWI using reinforcement learning | - | 0
Inferential Induction: A Novel Framework for Bayesian Reinforcement Learning | - | 0
Description Based Text Classification with Reinforcement Learning | - | 0
RL-Duet: Online Music Accompaniment Generation Using Deep Reinforcement Learning | - | 0
Multi-task Reinforcement Learning with a Planning Quasi-Metric | - | 0
Manipulating Reinforcement Learning: Poisoning Attacks on Cost Signals | - | 0
Bayesian Residual Policy Optimization: Scalable Bayesian Reinforcement Learning with Clairvoyant Experts | - | 0
Explicit Mean-Square Error Bounds for Monte-Carlo and Linear Stochastic Approximation | - | 0
Causally Correct Partial Models for Reinforcement Learning | - | 0
Accelerating Reinforcement Learning for Reaching using Continuous Curriculum Learning | - | 0
Automated Lane Change Strategy using Proximal Policy Optimization-based Deep Reinforcement Learning | - | 0
Reward-Free Exploration for Reinforcement Learning | - | 0
Student/Teacher Advising through Reward Augmentation | - | 0
Page 223 of 303

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | - | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | - | Unverified