SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes long-term reward.
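The interaction loop described above can be sketched in a few lines of Python. This is only an illustrative sketch: the `ChainEnv` environment, its reward values, and the two fixed policies are hypothetical and come from no paper listed on this page. It shows how an agent's actions produce rewards, and how the cumulative reward lets two policies be compared.

```python
from dataclasses import dataclass


@dataclass
class ChainEnv:
    """Hypothetical toy environment: walk along a line to reach the goal cell."""
    length: int = 5
    position: int = 0

    def reset(self) -> int:
        self.position = 0
        return self.position

    def step(self, action: int):
        # Action 1 moves right; any other action moves left (clamped at 0).
        move = 1 if action == 1 else -1
        self.position = max(0, self.position + move)
        done = self.position >= self.length - 1
        # Assumed reward scheme: +1.0 at the goal, a small penalty otherwise.
        reward = 1.0 if done else -0.1
        return self.position, reward, done


def run_episode(env, policy, max_steps=50):
    """Run one episode and return the cumulative (undiscounted) reward."""
    state = env.reset()
    total = 0.0
    for _ in range(max_steps):
        action = policy(state)                  # agent chooses an action
        state, reward, done = env.step(action)  # environment responds
        total += reward                         # accumulate the reward signal
        if done:
            break
    return total


def always_right(state):
    return 1


def always_left(state):
    return 0


if __name__ == "__main__":
    print("always_right:", run_episode(ChainEnv(), always_right))
    print("always_left: ", run_episode(ChainEnv(), always_left))
```

In this toy setting, the policy that always moves right reaches the goal and collects a higher cumulative reward than the one that always moves left. A learning algorithm would search for such a policy from experience rather than having it written by hand.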

Papers

Showing 6901-6950 of 15113 papers

Title | Status | Hype
Learning Multiresolution Matrix Factorization and its Wavelet Networks on Graphs | Code | 0
Robust Dynamic Bus Control: A Distributional Multi-agent Reinforcement Learning Approach |  | 0
OnSlicing: Online End-to-End Network Slicing with Reinforcement Learning |  | 0
Integrating Pretrained Language Model for Dialogue Policy Learning |  | 0
Koopman Q-learning: Offline Reinforcement Learning via Symmetries of Dynamics |  | 0
Learning Large Neighborhood Search Policy for Integer Programming | Code | 1
Rewards with Negative Examples for Reinforced Topic-Focused Abstractive Summarization |  | 0
A Collaborative Multi-agent Reinforcement Learning Framework for Dialog Action Decomposition |  | 0
Feedback Attribution for Counterfactual Bandit Learning in Multi-Domain Spoken Language Understanding |  | 0
A Generative Framework for Simultaneous Machine Translation |  | 0
Neuro-Symbolic Approaches for Text-Based Policy Learning | Code | 0
Learning Task Sampling Policy for Multitask Learning |  | 0
Decentralized Cooperative Reinforcement Learning with Hierarchical Information Structure |  | 0
Learning to Operate an Electric Vehicle Charging Station Considering Vehicle-grid Integration |  | 0
Human-Level Control without Server-Grade Hardware | Code | 0
Machine Learning aided Crop Yield Optimization |  | 0
Settling the Horizon-Dependence of Sample Complexity in Reinforcement Learning |  | 0
Investigation of Independent Reinforcement Learning Algorithms in Multi-Agent Environments |  | 0
An Actor-Critic Method for Simulation-Based Optimization |  | 0
Decentralized Multi-Agent Reinforcement Learning: An Off-Policy Method |  | 0
Learning Coordinated Terrain-Adaptive Locomotion by Imitating a Centroidal Dynamics Planner |  | 0
Adjacency constraint for efficient hierarchical reinforcement learning |  | 0
A Decentralized Reinforcement Learning Framework for Efficient Passage of Emergency Vehicles |  | 0
Convergence and Optimality of Policy Gradient Methods in Weakly Smooth Settings |  | 0
Intrusion Prevention through Optimal Stopping | Code | 1
Context Meta-Reinforcement Learning via Neuromodulation | Code | 0
On Joint Learning for Solving Placement and Routing in Chip Design | Code | 1
Reinforced Workload Distribution Fairness |  | 0
Mixed Cooperative-Competitive Communication Using Multi-Agent Reinforcement Learning |  | 0
Learning to Communicate with Reinforcement Learning for an Adaptive Traffic Control System |  | 0
GalilAI: Out-of-Task Distribution Detection using Causal Active Experimentation for Safe Transfer RL |  | 0
Attacking Video Recognition Models with Bullet-Screen Comments | Code | 1
Brick-by-Brick: Combinatorial Construction with Deep Reinforcement Learning |  | 0
Adaptive Discretization in Online Reinforcement Learning |  | 0
Data Informed Residual Reinforcement Learning for High-Dimensional Robotic Tracking Control |  | 0
Open Problem: Tight Online Confidence Intervals for RKHS Elements |  | 0
Proximal Reinforcement Learning: Efficient Off-Policy Evaluation in Partially Observed Markov Decision Processes | Code | 0
Efficient Meta Subspace Optimization | Code | 0
URLB: Unsupervised Reinforcement Learning Benchmark | Code | 1
D2RLIR : an improved and diversified ranking function in interactive recommendation systems based on deep reinforcement learning |  | 0
An Adaptable Approach to Learn Realistic Legged Locomotion without Examples |  | 0
Accelerating Robotic Reinforcement Learning via Parameterized Action Primitives |  | 0
Bayesian Sequential Optimal Experimental Design for Nonlinear Models Using Policy Gradient Reinforcement Learning |  | 0
Extracting Expert's Goals by What-if Interpretable Modeling |  | 0
Choosing the Best of Both Worlds: Diverse and Novel Recommendations through Multi-Objective Reinforcement Learning |  | 0
A Law of Iterated Logarithm for Multi-Agent Reinforcement Learning |  | 0
Reinforcement Learning in Linear MDPs: Constant Regret and Representation Selection |  | 0
Stabilising viscous extensional flows using Reinforcement Learning | Code | 0
The ODE Method for Asymptotic Statistics in Stochastic Approximation and Reinforcement Learning |  | 0
Model based Multi-agent Reinforcement Learning with Tensor Decompositions |  | 0
Page 139 of 303

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 |  | Unverified
2 | PPO | Mean Normalized Performance | 0.58 |  | Unverified