SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find the optimal policy, that is, the decision-making strategy that maximizes long-term reward.
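As a rough illustration of the agent-environment loop described above, here is a minimal sketch of tabular Q-learning on a hypothetical five-state chain. The environment, its `step` function, the reward of 1.0 at the goal, and the constants `GAMMA`, `ALPHA`, and `EPSILON` are all assumptions made for this example; none of them come from the papers listed on this page.

```python
import random

N_STATES = 5          # hypothetical chain: states 0..4, state 4 is the terminal goal
ACTIONS = (0, 1)      # 0 = move left, 1 = move right
GAMMA = 0.9           # discount factor (assumed value)
ALPHA = 0.5           # learning rate (assumed value)
EPSILON = 0.2         # exploration probability (assumed value)


def step(state, action):
    """Toy environment: returns (next_state, reward, done)."""
    next_state = max(state - 1, 0) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def choose_action(q_row, rng):
    """Epsilon-greedy action choice; ties between actions are broken at random."""
    if rng.random() < EPSILON or q_row[0] == q_row[1]:
        return rng.choice(ACTIONS)
    return 0 if q_row[0] > q_row[1] else 1


def train(episodes=2000, max_steps=100, seed=0):
    """Run Q-learning and return the learned action-value table."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            action = choose_action(q[state], rng)
            next_state, reward, done = step(state, action)
            # Bootstrap from the next state unless the episode has ended.
            target = reward if done else reward + GAMMA * max(q[next_state])
            q[state][action] += ALPHA * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Best-known action for each non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]
```

Calling `greedy_policy(train())` should give the action the agent currently prefers in each non-terminal state. In this toy setting one would expect it to favor moving right, toward the rewarded goal state.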

Papers

Showing 6951–6975 of 15113 papers

| Title | Status | Hype |
|---|---|---|
| Transfer learning with causal counterfactual reasoning in Decision Transformers | | 0 |
| Multi-Agent Reinforcement Learning for Active Voltage Control on Power Distribution Networks | Code | 1 |
| APPTeK: Agent-Based Predicate Prediction in Temporal Knowledge Graphs | | 0 |
| Reinforcement Learning in Factored Action Spaces using Tensor Decompositions | | 0 |
| DESTA: A Framework for Safe Reinforcement Learning with Markov Games of Intervention | | 0 |
| Learning Domain Invariant Representations in Goal-conditioned Block MDPs | Code | 1 |
| Finite Horizon Q-learning: Stability, Convergence, Simulations and an application on Smart Grids | | 0 |
| A Subgame Perfect Equilibrium Reinforcement Learning Approach to Time-inconsistent Problems | | 0 |
| Enhancing Reinforcement Learning with discrete interfaces to learn the Dyck Language | | 0 |
| DreamerPro: Reconstruction-Free Model-Based Reinforcement Learning with Prototypical Representations | Code | 1 |
| Comparing Heuristics, Constraint Optimization, and Reinforcement Learning for an Industrial 2D Packing Problem | | 0 |
| ABIDES-Gym: Gym Environments for Multi-Agent Discrete Event Simulation and Application to Financial Markets | Code | 1 |
| Fragment-based Sequential Translation for Molecular Optimization | | 0 |
| Multi-Agent Advisor Q-Learning | Code | 0 |
| Towards Hyperparameter-free Policy Selection for Offline Reinforcement Learning | Code | 0 |
| The Difficulty of Passive Learning in Deep Reinforcement Learning | | 0 |
| Fault-Tolerant Federated Reinforcement Learning with Theoretical Guarantee | Code | 1 |
| Landmark-Guided Subgoal Generation in Hierarchical Reinforcement Learning | Code | 1 |
| Distributed Multi-Agent Deep Reinforcement Learning Framework for Whole-building HVAC Control | | 0 |
| Applications of Multi-Agent Reinforcement Learning in Future Internet: A Comprehensive Survey | | 0 |
| Accelerating Distributed Deep Reinforcement Learning by In-Network Experience Sampling | | 0 |
| Learning Robust Controllers Via Probabilistic Model-Based Policy Search | | 0 |
| EnTRPO: Trust Region Policy Optimization Method with Entropy Regularization | | 0 |
| Learning to Simulate Self-Driven Particles System with Coordinated Policy Optimization | Code | 1 |
| Average-Reward Learning and Planning with Options | | 0 |
Page 279 of 605

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |