SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes long-term reward.
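The agent-environment loop described above can be illustrated with a minimal sketch. It assumes tabular Q-learning with epsilon-greedy exploration, which is only one of many RL methods and is not named by this page. The five-cell corridor environment, the `step` and `train` functions, and all hyperparameter values are hypothetical and chosen purely for illustration.

```python
import random

N_STATES = 5      # hypothetical corridor: cells 0..4, cell 4 is the goal
ACTIONS = (0, 1)  # 0 = step left, 1 = step right


def step(state, action):
    """Toy environment (an assumption): returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=200, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning; hyperparameters are illustrative, not tuned."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy: explore at random sometimes (and on ties),
            # otherwise take the action with the higher estimated value.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.choice(ACTIONS)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, action)
            # Move the estimate toward reward + discounted best next value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Best-known action for each non-goal cell."""
    return [0 if q[s][0] > q[s][1] else 1 for s in range(N_STATES - 1)]


q_table = train()
policy = greedy_policy(q_table)
```

In this toy setting the only reward is at the goal cell, so the learned greedy policy steps right in every cell. Real environments are usually stochastic and far larger, which is why much of the work listed here replaces the table with function approximation.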

Papers

Showing 6451-6500 of 15113 papers

Title | Status | Hype
Automated Reinforcement Learning (AutoRL): A Survey and Open Problems | - | 0
Benchmarking Deep Reinforcement Learning Algorithms for Vision-based Robotics | - | 0
Active Reinforcement Learning -- A Roadmap Towards Curious Classifier Systems for Self-Adaptation | - | 0
In Defense of the Unitary Scalarization for Deep Multi-Task Learning | Code | 1
STIR^2: Reward Relabelling for combined Reinforcement and Imitation Learning on sparse-reward tasks | - | 0
Pavlovian Signalling with General Value Functions in Agent-Agent Temporal Decision Making | - | 0
Distributed Cooperative Multi-Agent Reinforcement Learning with Directed Coordination Graph | - | 0
State of the Art of User Simulation approaches for conversational information retrieval | - | 0
Opportunities of Hybrid Model-based Reinforcement Learning for Cell Therapy Manufacturing Process Control | - | 0
When is Offline Two-Player Zero-Sum Markov Game Solvable? | - | 0
Verified Probabilistic Policies for Deep Reinforcement Learning | Code | 1
A Multi-agent Reinforcement Learning Approach for Efficient Client Selection in Federated Learning | - | 0
Assessing Policy, Loss and Planning Combinations in Reinforcement Learning using a New Modular Architecture | - | 0
Mirror Learning: A Unifying Framework of Policy Optimisation | Code | 1
Neural Network Optimization for Reinforcement Learning Tasks Using Sparse Computations | - | 0
Offline Reinforcement Learning for Road Traffic Control | - | 0
SABLAS: Learning Safe Control for Black-box Dynamical Systems | Code | 1
Combining Reinforcement Learning and Inverse Reinforcement Learning for Asset Allocation Recommendations | - | 0
Sample Efficient Deep Reinforcement Learning via Uncertainty Estimation | Code | 1
Offsetting Unequal Competition through RL-assisted Incentive Schemes | - | 0
Using Simulation Optimization to Improve Zero-shot Policy Transfer of Quadrotors | Code | 1
Deep Reinforcement Learning, a textbook | - | 0
Deep Learning-based Predictive Control of Battery Management for Frequency Regulation | Code | 0
Learning Complex Spatial Behaviours in ABM: An Experimental Observational Study | - | 0
Analyzing Micro-Founded General Equilibrium Models with Many Agents using Deep Reinforcement Learning | - | 0
A Deeper Understanding of State-Based Critics in Multi-Agent Reinforcement Learning | - | 0
Execute Order 66: Targeted Data Poisoning for Reinforcement Learning | - | 0
Actor-Critic Network for Q&A in an Adversarial Environment | - | 0
Hybrid intelligence for dynamic job-shop scheduling with deep reinforcement learning and attention mechanism | Code | 1
Robust Algorithmic Collusion | - | 0
Toward Causal-Aware RL: State-Wise Action-Refined Temporal Difference | Code | 0
Reinforcement Learning for Task Specifications with Action-Constraints | - | 0
Temporal Complementarity-Guided Reinforcement Learning for Image-to-Video Person Re-Identification | - | 0
Symmetry-Aware Neural Architecture for Embodied Visual Exploration | - | 0
Joint Learning-Based Stabilization of Multiple Unknown Linear Systems | - | 0
A Surrogate-Assisted Controller for Expensive Evolutionary Reinforcement Learning | - | 0
Toward Pareto Efficient Fairness-Utility Trade-off in Recommendation through Reinforcement Learning | - | 0
Operator Deep Q-Learning: Zero-Shot Reward Transferring in Reinforcement Learning | - | 0
Transfer RL across Observation Feature Spaces via Model-Based Regularization | - | 0
Stochastic convex optimization for provably efficient apprenticeship learning | - | 0
Using Graph-Aware Reinforcement Learning to Identify Winning Strategies in Diplomacy Games (Student Abstract) | - | 0
Single-Shot Pruning for Offline Reinforcement Learning | - | 0
Robust Entropy-regularized Markov Decision Processes | - | 0
SimSR: Simple Distance-based State Representation for Deep Reinforcement Learning | Code | 1
A Theoretical Understanding of Gradient Bias in Meta-Reinforcement Learning | Code | 0
Importance of Empirical Sample Complexity Analysis for Offline Reinforcement Learning | - | 0
Stability-Preserving Automatic Tuning of PID Control with Reinforcement Learning | - | 0
Reversible Upper Confidence Bound Algorithm to Generate Diverse Optimized Candidates | - | 0
Multi-Agent Reinforcement Learning via Adaptive Kalman Temporal Difference and Successor Representation | - | 0
Constructing a Good Behavior Basis for Transfer using Generalized Policy Updates | - | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | - | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | - | Unverified